NAME
Sort::DataTypes - Sort a list of data using methods relevant to the type
of data
SYNOPSIS
use Sort::DataTypes qw(:all);
DESCRIPTION
This module allows you to sort a list of data elements using methods
that are relevant to the type of data contained in the list. This
module does not attempt to be the fastest sorter on the block. If you
are sorting thousands of elements and need a lot of speed, you should
refer to a module specializing in the specific type of sort you will be
doing. However, to do smaller sorts of different types of data, this is
the module to use.
ROUTINES
All sort routines are named sort_METHOD where METHOD is the name of the
method. Every sort_METHOD routine has both a forward and a reverse form:
sort_METHOD(\@list,@args);
sort_rev_METHOD(\@list,@args);
where @args are any additional arguments needed for that sort method.
Corresponding to every sort_METHOD routine is a cmp_METHOD routine which
takes two elements (and possibly additional arguments as required by the
actual method) and returns a -1, 0, or 1 (similar to the cmp or <=>
operators).
$flag = cmp_METHOD($x,$y,@args);
$flag = cmp_rev_METHOD($x,$y,@args);
All sort_METHOD functions can also be used to sort a list using a hash:
sort_METHOD(\@list,[@args],\%hash);
sort_rev_METHOD(\@list,[@args],\%hash);
The elements of @list are sorted. Each element in @list must be a key in
%hash, and the value of that key must be of the appropriate type. The
elements of @list are sorted by using the cmp_METHOD function to compare
the values in the hash.
For example, if %hash contains the key/value pairs:
foo => 3
bar => 5
ick => 1
and @list contains (foo,bar,ick), then sorting:
sort_numerical(\@list,\%hash)
=> @list = (ick,foo,bar)
since "ick" corresponds to a numerical value of 1, "foo" to 3, and "bar"
to 5.
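As a rough illustration of the result described above, the following
sketch reproduces it with Perl's built-in sort. It is not the module's
implementation; it only assumes that the comparison is applied to the
hash values rather than to the list elements themselves.

```perl
use strict;
use warnings;

my %hash = (foo => 3, bar => 5, ick => 1);
my @list = qw(foo bar ick);

# Order the keys by the numerical value stored under each key.
@list = sort { $hash{$a} <=> $hash{$b} } @list;

print "@list\n";   # ick foo bar
```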
sort_valid_method
cmp_valid_method
use Sort::DataTypes qw(:all);
$flag = sort_valid_method($string);
$flag = cmp_valid_method($string);
These are identical and return 1 if there is a valid sort method
named $string in the module, and 0 otherwise. For example, there is a
function "sort_numerical" defined in this module, but there is no function
"sort_foobar", so the following would occur:
sort_valid_method("numerical")
=> 1
sort_valid_method("foobar")
=> 0
Note that the methods must NOT include the "sort_" or "cmp_" prefix.
sort_by_method
cmp_by_method
use Sort::DataTypes qw(:all);
sort_by_method($method,\@list [,@args]);
cmp_by_method ($method,$ele1,$ele2 [,@args]);
These sort a list, or compare two elements, using the given method
(which is any string that returns 1 when passed to
sort_valid_method). @args are arguments to pass to the sort.
If the method is not valid, the list is left untouched.
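One way to picture dispatching on a method name is a lookup table of
comparison routines. The sketch below is an assumption made for
illustration only: the names %cmp_for, my_valid_method and
my_sort_by_method are hypothetical and are not part of this module.

```perl
use strict;
use warnings;

# Hypothetical lookup table standing in for the module's set of methods.
my %cmp_for = (
    numerical  => sub { $_[0] <=> $_[1] },
    alphabetic => sub { $_[0] cmp $_[1] },
);

# Returns 1 if the named method exists in the table, 0 otherwise.
sub my_valid_method {
    my ($method) = @_;
    return exists $cmp_for{$method} ? 1 : 0;
}

# Sorts @$list in place; leaves it untouched for an unknown method.
sub my_sort_by_method {
    my ($method, $list) = @_;
    return unless my_valid_method($method);
    my $cmp = $cmp_for{$method};
    @$list = sort { $cmp->($a, $b) } @$list;
    return;
}

my @nums = (10, 9, 100);
my_sort_by_method('numerical', \@nums);   # @nums is now (9, 10, 100)

my @same = (10, 9, 100);
my_sort_by_method('foobar', \@same);      # unknown method: @same unchanged
```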
sort_numerical
sort_rev_numerical
cmp_numerical
cmp_rev_numerical
use Sort::DataTypes qw(:all);
sort_numerical(\@list);
sort_rev_numerical(\@list);
sort_numerical(\@list,\%hash);
sort_rev_numerical(\@list,\%hash);
$flag = cmp_numerical($x,$y);
$flag = cmp_rev_numerical($x,$y);
These sort a list numerically in forward or reverse order, or
compare two elements numerically. There is little reason to use
any of these routines, since it would be more efficient to simply call
sort as:
sort { $a <=> $b } @list
but they are included for the sake of completeness (and for use by
the sort_by_method/cmp_by_method routines). Also, if the code is
being automatically generated, numerical sorts won't have to be a
special case.
sort_alphabetic
sort_rev_alphabetic
cmp_alphabetic
cmp_rev_alphabetic
use Sort::DataTypes qw(:all);
sort_alphabetic(\@list);
sort_rev_alphabetic(\@list);
sort_alphabetic(\@list,\%hash);
sort_rev_alphabetic(\@list,\%hash);
$flag = cmp_alphabetic($x,$y);
$flag = cmp_rev_alphabetic($x,$y);
These do alphabetic sorts. As with numerical sorts, there is little
reason to call these, and they are included for the sake of
completeness.
sort_length
sort_rev_length
cmp_length
cmp_rev_length
use Sort::DataTypes qw(:all);
sort_length(\@list);
sort_rev_length(\@list);
sort_length(\@list,\%hash);
sort_rev_length(\@list,\%hash);
$flag = cmp_length($x,$y);
$flag = cmp_rev_length($x,$y);
These take strings and compare them by length, falling back to an
alphabetic comparison if they are the same length.
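The rule can be sketched as a two-stage comparison. The routine name
by_length is hypothetical and the code is only an approximation of the
behavior described above, not the module's own code.

```perl
use strict;
use warnings;

# Hypothetical comparator: shorter strings first, ties broken alphabetically.
sub by_length {
    my ($x, $y) = @_;
    return (length($x) <=> length($y)) || ($x cmp $y);
}

my @words = sort { by_length($a, $b) } qw(pear fig apple kiwi);

print "@words\n";   # fig kiwi pear apple
```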
sort_ip
sort_rev_ip
cmp_ip
cmp_rev_ip
use Sort::DataTypes qw(:all);
sort_ip(\@list);
sort_rev_ip(\@list);
sort_ip(\@list,\%hash);
sort_rev_ip(\@list,\%hash);
$flag = cmp_ip($x,$y);
$flag = cmp_rev_ip($x,$y);
These sort/compare IP addresses. Each value can be a pure IP (in the
form A.B.C.D) or a CIDR notation which includes the netmask
(A.B.C.D/MASK).
When comparing CIDR representations, if the IP part of two elements
is identical, the following two rules are used:
an element without a mask comes before one that has a mask
two elements with masks are sorted numerically by mask
So the following elements are in sorted order:
10.20.30.40 < 10.20.30.40/4 < 10.20.30.40/16
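The two rules above can be sketched as follows. The routine name by_ip
is hypothetical, the input is assumed to be well-formed IPv4, and
comparing the four octets numerically from left to right is an
assumption rather than something stated in this document.

```perl
use strict;
use warnings;

# Hypothetical comparator for A.B.C.D or A.B.C.D/MASK strings.
# Assumes well-formed IPv4 input; no validation is attempted.
sub by_ip {
    my ($x, $y) = @_;
    my ($ip_x, $mask_x) = split m{/}, $x;
    my ($ip_y, $mask_y) = split m{/}, $y;

    # Assumption: octets are compared numerically, left to right.
    my @ox = split /\./, $ip_x;
    my @oy = split /\./, $ip_y;
    for my $i (0 .. 3) {
        my $c = $ox[$i] <=> $oy[$i];
        return $c if $c;
    }

    # Same address: an element without a mask comes first.
    return  0 if !defined($mask_x) && !defined($mask_y);
    return -1 if !defined($mask_x);
    return  1 if !defined($mask_y);

    # Both have masks: order numerically by mask.
    return $mask_x <=> $mask_y;
}

my @ips = sort { by_ip($a, $b) }
    qw(10.20.30.40/16 10.20.30.40 9.200.1.1 10.20.30.40/4);
# @ips is now: 9.200.1.1 10.20.30.40 10.20.30.40/4 10.20.30.40/16
```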
sort_domain
sort_rev_domain
cmp_domain
cmp_rev_domain
use Sort::DataTypes qw(:all);
sort_domain(\@list [,$sep]);
sort_rev_domain(\@list [,$sep]);
sort_domain(\@list, [$sep,] \%hash);
sort_rev_domain(\@list, [$sep,] \%hash);
$flag = cmp_domain($x,$y [,$sep]);
$flag = cmp_rev_domain($x,$y [,$sep]);
These sort domain names (foo.bar.com) or anything else consisting of
a class, subclass, subsubclass, etc., with the most significant
class at the right (i.e. subsubclass.subclass.class).
Each element in the list is split into subvalues. Subvalues in a
domain are separated from each other by a period (.) by default, but
this can be overridden. If $sep is passed in, it is a regular
expression to split the values into subvalues.
Since the most significant subvalue in the domain is at the right,
any domain ending with ".com" would come before any domain ending in
".edu".
a.b < z.b < a.bb < z.bb < a.c
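A minimal sketch of right-to-left comparison, reproducing the ordering
shown above. The routine name by_domain is hypothetical, and the
handling of names with different numbers of subvalues (the shorter one
first) is an assumption that this document does not specify.

```perl
use strict;
use warnings;

# Hypothetical comparator: split on $sep (default: a literal period),
# then compare subvalues alphabetically from right to left.
sub by_domain {
    my ($x, $y, $sep) = @_;
    $sep = qr/\./ unless defined $sep;
    my @sx = reverse split $sep, $x;
    my @sy = reverse split $sep, $y;
    while (@sx && @sy) {
        my $c = shift(@sx) cmp shift(@sy);
        return $c if $c;
    }
    # Assumption: if one name is a suffix of the other, the shorter sorts first.
    return @sx <=> @sy;
}

my @domains = sort { by_domain($a, $b) } qw(a.c z.bb a.b a.bb z.b);
# @domains is now: a.b z.b a.bb z.bb a.c
```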
sort_numdomain
sort_rev_numdomain
cmp_numdomain
cmp_rev_numdomain
use Sort::DataTypes qw(:all);
sort_numdomain(\@list [,$sep]);
sort_rev_numdomain(\@list [,$sep]);
sort_numdomain(\@list, [$sep,] \%hash);
sort_rev_numdomain(\@list, [$sep,] \%hash);
$flag = cmp_numdomain($x,$y [,$sep]);
$flag = cmp_rev_numdomain($x,$y [,$sep]);
A related type of sorting is numdomain sorting. This is identical to
domain sorting except that if the two subvalues being compared are both
integers, they are compared numerically. So:
a.2.c < a.11.c
It should be noted that if a field may be either numeric or
alphanumeric, sorting with this method may yield unexpected results.
For example, sorting the three elements:
a.1.b
a.2.b
a.X.b
will use numeric comparisons when comparing the 2nd field of the
first and second elements, but it will use alphabetic comparisons
when comparing the first and third elements (or the second and third
elements).
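To see why mixed fields can surprise, consider a per-field comparison
that is numeric only when both fields are integers. The routine name
cmp_field is hypothetical, "integer" is assumed to mean a string of
digits, and the field values below are chosen for illustration; with
them, the comparisons form a cycle, so no consistent ordering exists.

```perl
use strict;
use warnings;

# Hypothetical per-field comparison: numeric only when both are integers.
sub cmp_field {
    my ($x, $y) = @_;
    return $x <=> $y if $x =~ /\A[0-9]+\z/ && $y =~ /\A[0-9]+\z/;
    return $x cmp $y;
}

my @cycle = (
    cmp_field('2',  '11'),   # -1: both integers, so 2 < 11 numerically
    cmp_field('11', '1X'),   # -1: "11" lt "1X" as strings
    cmp_field('1X', '2'),    # -1: "1X" lt "2" as strings
);
# 2 < 11 < 1X < 2: the pairwise results are not transitive.
```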
sort_path
sort_rev_path
cmp_path
cmp_rev_path
use Sort::DataTypes qw(:all);
sort_path(\@list [,$sep]);
sort_rev_path(\@list [,$sep]);
sort_path(\@list, [$sep,] \%hash);
sort_rev_path(\@list, [$sep,] \%hash);
$flag = cmp_path($x,$y [,$sep]);
$flag = cmp_rev_path($x,$y [,$sep]);
These sort paths (/A/B/C...) or anything else consisting of a class,
subclass, subsubclass, etc., with the most significant class at the
left.
Elements in a path (or classes, subclasses, etc.) are separated from
each other by a slash (/) unless $sep is passed in. If $sep is
passed in, it is a regular expression to split the elements in a
path.
Since the most significant element in the path is at the left, you
get the following behavior:
a/b < a/z < aa/b < aa/z < b/b
When sorting lists that have a mixture of relative paths and
explicit (absolute) paths, the explicit paths will come first. So:
/b/c < a/b
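A minimal sketch of left-to-right comparison, reproducing both orderings
shown above. The routine name by_path is hypothetical and this is not
the module's code. In this sketch, a path with a leading slash splits
into an empty first element, which compares lower than any non-empty
element, so such paths sort first without special handling.

```perl
use strict;
use warnings;

# Hypothetical comparator: split on $sep (default: a slash), then
# compare elements alphabetically from left to right.
sub by_path {
    my ($x, $y, $sep) = @_;
    $sep = qr{/} unless defined $sep;
    my @px = split $sep, $x;   # "/b/c" yields ("", "b", "c")
    my @py = split $sep, $y;
    while (@px && @py) {
        my $c = shift(@px) cmp shift(@py);
        return $c if $c;
    }
    # Assumption: if one path is a prefix of the other, the shorter sorts first.
    return @px <=> @py;
}

my @paths = sort { by_path($a, $b) } qw(b/b a/b /b/c aa/z a/z aa/b);
# @paths is now: /b/c a/b a/z aa/b aa/z b/b
```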
sort_numpath
sort_rev_numpath
cmp_numpath
cmp_rev_numpath
use Sort::DataTypes qw(:all);
sort_numpath(\@list [,$sep]);
sort_rev_numpath(\@list [,$sep]);
sort_numpath(\@list, [$sep,] \%hash);
sort_rev_numpath(\@list, [$sep,] \%hash);
$flag = cmp_numpath($x,$y [,$sep]);
$flag = cmp_rev_numpath($x,$y [,$sep]);
A related type of sorting is numpath sorting. This is identical to
path sorting except that if the two elements being compared are both
integers, they are compared numerically. So:
a/2/c < a/11/c
sort_random
sort_rev_random
cmp_random
cmp_rev_random
use Sort::DataTypes qw(:all);
sort_random(\@list);
sort_rev_random(\@list);
sort_random(\@list,\%hash);
sort_rev_random(\@list,\%hash);
$flag = cmp_random($x,$y);
$flag = cmp_rev_random($x,$y);
This uses the Fisher-Yates algorithm to randomly shuffle an array in
place. This routine was derived from the book The Perl Cookbook by
Tom Christiansen and Nathan Torkington.
sort_rev_random is identical to sort_random, and is included simply for
the situation where the sort routines are being called in some
automatically generated code that may add the 'rev_' prefix.
cmp_random simply returns a random -1, 0, or 1.
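For readers unfamiliar with the algorithm, a generic Fisher-Yates
shuffle can be sketched as below. This is a standard textbook form and
not necessarily the exact code used by this module; the routine name
fisher_yates_shuffle is hypothetical.

```perl
use strict;
use warnings;

# Shuffle @$list in place. Each permutation is equally likely,
# to the extent that rand() is uniform.
sub fisher_yates_shuffle {
    my ($list) = @_;
    for (my $i = $#$list; $i > 0; $i--) {
        my $j = int(rand($i + 1));         # 0 <= $j <= $i
        @$list[$i, $j] = @$list[$j, $i];   # swap elements $i and $j
    }
    return;
}

my @deck = (1 .. 10);
fisher_yates_shuffle(\@deck);   # same elements, random order
```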
sort_version
sort_rev_version
cmp_version
cmp_rev_version
use Sort::DataTypes qw(:all);
sort_version(\@list);
sort_rev_version(\@list);
sort_version(\@list,\%hash);
sort_rev_version(\@list,\%hash);
$flag = cmp_version($x,$y);
$flag = cmp_rev_version($x,$y);
These sort a list of version numbers of the form
MAJOR.MINOR.SUBMINOR ... (any number of levels is allowed). The
following examples should illustrate the ordering:
1.1.x < 1.2 < 1.2.x   Numerical versions are compared first at
                      the highest level, then at the next highest,
                      etc. The first non-equal compare sets the
                      order.
1.a < 1.b             Alphanumeric levels that start with a letter
                      are compared alphabetically.
1.2a < 1.2 < 1.03a    Alphanumeric levels that start with a number
                      are first compared numerically with only the
                      numeric part. If they are equal, alphanumeric
                      levels come before purely numerical levels.
                      Otherwise, they are compared alphabetically.
1.a < 1.2a            An alphanumeric level that starts with a letter
                      comes before one that starts with a number.
1.01a < 1.1a          Two alphanumeric levels that are numerically
                      equal in the number part and equal in the
                      remaining part are compared alphabetically.
sort_date
sort_rev_date
cmp_date
cmp_rev_date
use Sort::DataTypes qw(:all);
sort_date(\@list);
sort_rev_date(\@list);
sort_date(\@list,\%hash);
sort_rev_date(\@list,\%hash);
$flag = cmp_date($x,$y);
$flag = cmp_rev_date($x,$y);
These sort a list of dates. Dates are anything that can be parsed
with Date::Manip.
sort_line
sort_rev_line
cmp_line
cmp_rev_line
use Sort::DataTypes qw(:all);
sort_line(\@list,$n [,$sep]);
sort_rev_line(\@list,$n [,$sep]);
sort_line(\@list,$n, [$sep,] \%hash);
sort_rev_line(\@list,$n, [$sep,] \%hash);
$flag = cmp_line($x,$y,$n [,$sep]);
$flag = cmp_rev_line($x,$y,$n [,$sep]);
These take a list of lines and sort on the Nth field using $sep as
the regular expression splitting the lines into fields. Fields are
numbered starting at 0. If no $sep is given, it defaults to white
space.
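Field-based comparison can be sketched with split and a list slice. The
routine name by_field is hypothetical; the sketch assumes that every
line has at least $n+1 fields, that lines have no leading white space,
and that fields are compared alphabetically.

```perl
use strict;
use warnings;

# Hypothetical comparator: compare the Nth field (counting from 0)
# of two lines, splitting on $sep (default: white space).
sub by_field {
    my ($x, $y, $n, $sep) = @_;
    $sep = qr/\s+/ unless defined $sep;
    my $fx = (split $sep, $x)[$n];
    my $fy = (split $sep, $y)[$n];
    return $fx cmp $fy;
}

my @lines = ('carol 30 admin', 'alice 25 staff', 'bob 41 guest');

my @by_name = sort { by_field($a, $b, 0) } @lines;   # alice, bob, carol
my @by_role = sort { by_field($a, $b, 2) } @lines;   # admin, guest, staff
```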
sort_numline
sort_rev_numline
cmp_numline
cmp_rev_numline
use Sort::DataTypes qw(:all);
sort_numline(\@list,$n [,$sep]);
sort_rev_numline(\@list,$n [,$sep]);
sort_numline(\@list,$n, [$sep,] \%hash);
sort_rev_numline(\@list,$n, [$sep,] \%hash);
$flag = cmp_numline($x,$y,$n [,$sep]);
$flag = cmp_rev_numline($x,$y,$n [,$sep]);
These are similar to the sort_line/cmp_line routines, but will sort
numerically if the Nth field is an integer, and alphabetically otherwise.
sort_function
sort_rev_function
cmp_function
cmp_rev_function
use Sort::DataTypes qw(:all);
sort_function(\@list,\&func);
sort_rev_function(\@list,\&func);
sort_function(\@list,\&func,\%hash);
sort_rev_function(\@list,\&func,\%hash);
$flag = cmp_function($x,$y,\&func);
$flag = cmp_rev_function($x,$y,\&func);
This is a catch-all sort function. It takes a reference to a
function suitable to compare two elements and return -1, 0, or 1
depending on the order of the elements.
BACKWARDS INCOMPATIBILITIES
The following is a list of backwards incompatibilities.
Version 2.00 handling of hashes
In version 1.xx, when sorting by hash, the hash itself was passed in
(flattened into the argument list). As of 2.00, it is passed in by
reference to avoid any confusion with optional arguments.
KNOWN PROBLEMS
None at this point.
LICENSE
This script is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.
AUTHOR
Sullivan Beck (sbeck@cpan.org)