Lexicographic order

{{hatnote|For similarly named ordering systems outside mathematics, see [[alphabetical order]] and [[Collation]].}}
[[File:Orderings; 6 choose 3.svg|thumb|340px|Orderings of the 3-[[subset]]s of <math>\scriptstyle \{1,...,6\}</math> (and the corresponding [[Binary numeral system|binary]] [[Vector (mathematics and physics)|vectors]])<br>When the (blue) triples are in ''lex'' order the (red) vectors are in ''revlex'' order, and vice versa. The arrangements on the right side show [[Colexicographical order|''colex'']] and ''revcolex'' order.]]
In [[mathematics]], the '''lexicographic order''' (also known as '''lexical order''' or '''alphabetical order''') is a generalization of the [[alphabetical order]] of words, which compares words letter by letter, starting from the first.
 
==Definition==
Given two [[partially ordered set]]s ''A'' and ''B'', the lexicographical [[order theory|order]] on the [[Cartesian product]] ''A'' &times; ''B'' is defined as
:(''a'',''b'') &le; (''a''&prime;,''b''&prime;) if and only if ''a'' < ''a''&prime; or (''a'' = ''a''&prime; and ''b'' &le; ''b''&prime;).
 
The result is a partial order. If ''A'' and ''B'' are [[total order|totally ordered]], then the result is a total order as well.
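
The definition can be sketched in a few lines of Python. This is a minimal illustration that assumes the elements of ''A'' and ''B'' are values supporting <code>&lt;</code> and <code>==</code>; the function name <code>lex_leq</code> is hypothetical and not taken from any library.

```python
def lex_leq(p, q):
    """Return True if the pair p = (a, b) is <= the pair q = (a2, b2) lexicographically."""
    a, b = p
    a2, b2 = q
    # The first components decide unless they are equal;
    # only on a tie are the second components consulted.
    return a < a2 or (a == a2 and b <= b2)

pairs = [(2, 1), (1, 9), (1, 3), (2, 0)]
# Python's built-in tuple comparison follows the same rule,
# so sorting the pairs yields their lexicographic order.
ordered = sorted(pairs)
print(ordered)
```

Here <code>(1, 9)</code> precedes <code>(2, 0)</code> because the first components already differ, regardless of the second components.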
 
More generally, one can define the lexicographic order on the Cartesian product of ''n'' ordered sets, on the Cartesian product of a countably infinite family of ordered sets, and on the union of such sets.
 
 
==Motivation and uses==
 
The name of the lexicographic order comes from its generalizing the order given to words in a [[dictionary]]: a sequence of letters (that is, a ''word'')
 
:''a''<sub>1</sub>''a''<sub>2</sub> ... ''a''<sub>''k''</sub>
 
appears in a dictionary before a sequence
 
:''b''<sub>1</sub>''b''<sub>2</sub> ... ''b''<sub>''k''</sub>
 
if and only if, at the first position ''i'' where ''a<sub>i</sub>'' differs from ''b<sub>i</sub>'', the letter ''a<sub>i</sub>'' comes before ''b<sub>i</sub>'' in the [[alphabet]].
 
That comparison assumes both sequences are the same length. To ensure they are the same length, the shorter sequence is usually padded at the end with enough "blanks" (a special symbol that is treated as coming before any other symbol). This also allows ordering of phrases. For the purpose of dictionaries, etc., padding with blank spaces is always done. See [[alphabetical order]].
 
For example, the word "Thomas" appears before "Thompson" in dictionaries because the letter 'a' comes before the letter 'p' in the alphabet. The 5th letter is the first that is different in the two words; the first 4 letters are "Thom" in both. Because it is the first difference, the 5th letter is the most significant difference (for an alphabetical ordering).
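
The padding rule and the "first difference decides" rule can be sketched as follows. This is an illustrative sketch only: the helper names <code>pad</code>, <code>first_difference</code> and <code>comes_before</code> are hypothetical, and the blank is assumed to be the space character, which compares lower than any letter in Python.

```python
def pad(u, v, blank=" "):
    """Right-pad the shorter word with blanks so both have the same length."""
    n = max(len(u), len(v))
    return u.ljust(n, blank), v.ljust(n, blank)

def first_difference(u, v):
    """Return the 0-based index of the first differing position, or None if the words are equal."""
    for i, (x, y) in enumerate(zip(*pad(u, v))):
        if x != y:
            return i
    return None

def comes_before(u, v):
    """True if u is listed strictly before v; the first difference decides."""
    i = first_difference(u, v)
    if i is None:
        return False  # identical words
    pu, pv = pad(u, v)
    return pu[i] < pv[i]

print(first_difference("Thomas", "Thompson"))  # 0-based index of the 5th letter
print(comes_before("Thomas", "Thompson"))
```

Because the blank is smaller than every letter, a word such as "Thom" is listed before "Thomas".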
 
A lexicographical ordering may not coincide with conventional alphabetical ordering. For example, the numerical order of [[Unicode]] codepoints does not always correspond to traditional alphabetic orderings of the characters, which vary from language to language. So the lexicographic ordering induced by codepoint value sorts strings in an unambiguous canonical order, but it does not necessarily "alphabetize" them in the conventional sense.
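
A short Python sketch shows the difference. Python compares strings by code point, so an uppercase "Z" sorts before a lowercase "a". The case-folded key used here is only a rough stand-in for conventional alphabetizing, which in practice depends on language-specific collation rules.

```python
words = ["apple", "Zebra", "banana"]

# Plain string comparison uses code point values: 'Z' (90) < 'a' (97).
by_codepoint = sorted(words)

# Ignoring case gives an order closer to a conventional English dictionary.
case_folded = sorted(words, key=str.casefold)

print(by_codepoint)
print(case_folded)
```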
 
An important property of the lexicographical order is that it preserves [[well-order]]s, that is, if ''A'' and ''B'' are well-ordered sets, then the product set ''A'' &times; ''B'' with the lexicographical order is also well-ordered.
 
An important exploitation of lexicographical ordering is expressed in the [[ISO 8601]] date formatting scheme, which expresses a date as YYYY-MM-DD. This date ordering lends itself to straightforward [[sorting algorithm|computerized sorting]] of dates such that the sorting algorithm does not need to treat the numeric parts of the date string any differently from a string of non-numeric characters, and the dates will be sorted into chronological order. Note, however, that for this to work, there must always be four digits for the year, two for the month, and two for the day, so for example single-digit days must be padded with a zero yielding '01', '02', ..., '09'.
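
The effect of zero padding can be checked with a small Python sketch. The sample dates are arbitrary, and the unpadded format is constructed here only to show what goes wrong without padding.

```python
from datetime import date

dates = [date(2011, 5, 9), date(2009, 12, 31), date(2011, 5, 10), date(2011, 1, 2)]
chronological = sorted(dates)

# ISO 8601 strings (YYYY-MM-DD, zero padded) sort as plain text
# into the same order as the dates themselves.
iso_sorted = sorted(d.isoformat() for d in dates)

def unpadded(d):
    """Format a date without zero padding, e.g. 2011-5-9 (for illustration only)."""
    return f"{d.year}-{d.month}-{d.day}"

# Without padding, "2011-5-10" sorts before "2011-5-9" because '1' < '9'.
unpadded_sorted = sorted(unpadded(d) for d in dates)

print(iso_sorted)
print(unpadded_sorted)
```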
 
Similarly, numerals with the same number of digits, such as 101, 102, 103, ..., 112, ..., 200, 201, 202, are in the same order lexicographically as numerically.
 
Another generalization of lexical ordering occurs in [[social choice theory]] (the theory of elections). Consider an election in which there are 4 candidates A, B, C and D, each voter expresses a top-to-bottom ordering of the candidates, and the voters' orderings are as follows:
 
{| class="wikitable"
!18%
!17%
!33%
!32%
|-
|A
|B
|C
|D
|-
|B
|A
|D
|B
|-
|C
|C
|A
|A
|-
|D
|D
|B
|C
|}
 
The [[Minimax Condorcet|MinMax]] voting method is a simple [[Condorcet method]] that counts the votes as in a round-robin tournament (all possible pairings of candidates) and judges each candidate according to its largest "pairwise" defeat. The winner is the candidate whose largest defeat is the smallest. In the example:
*The largest defeat of A is by D: '''65%''' (33%+32%) rank D over A.
*The largest defeat of B is by D: '''65%''' (33%+32%) rank D over B.
*The largest defeat of C is by A (or B): '''67%''' (18%+17%+32%) rank A over C (and B over C).
*The largest defeat of D is by C: '''68%''' (18%+17%+33%) rank C over D.
 
MinMax declares a tie between A and B since the largest defeats for both are the same size, 65%. This is like saying "Thomas" and "Thompson" should be at the same position because they have the same first letter. However, if the defeats are compared lexically, we have the MinLexMax method. With MinLexMax, because the largest defeats of A and B are the same size, their next largest defeats are then compared:
*A's next largest defeat is '''0%'''. (This is a padding, since A has only one defeat.)
*B's next largest defeat is by A: '''51%''' (18%+33%) rank A over B.
Since B's next largest defeat is larger than A's, MinLexMax elects A, which makes more sense than the MinMax tie since a majority rank A over B.
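
The MinMax and MinLexMax comparisons for this example can be reproduced with a short Python sketch. The ballot data come from the table above; the function names <code>support</code> and <code>defeats</code> are hypothetical, and the code is meant only as an illustration of the comparison, not as a reference implementation of either voting method.

```python
# Each entry: (percentage of voters, ranking from top to bottom).
ballots = [(18, "ABCD"), (17, "BACD"), (33, "CDAB"), (32, "DBAC")]
candidates = "ABCD"

def support(x, y):
    """Percentage of voters who rank x over y."""
    return sum(w for w, order in ballots if order.index(x) < order.index(y))

def defeats(c):
    """Sizes of c's pairwise defeats, largest first, padded with 0 to a fixed length."""
    sizes = [support(o, c) for o in candidates
             if o != c and support(o, c) > support(c, o)]
    sizes.sort(reverse=True)
    return sizes + [0] * (len(candidates) - 1 - len(sizes))

# MinMax looks only at the largest defeat, so A and B tie.
largest = {c: defeats(c)[0] for c in candidates}
minmax_winners = [c for c in candidates if largest[c] == min(largest.values())]

# MinLexMax compares the whole lists lexicographically (Python list comparison).
minlexmax_winner = min(candidates, key=defeats)

print(minmax_winners, minlexmax_winner)
```

The padded defeat lists are <code>[65, 0, 0]</code> for A and <code>[65, 51, 0]</code> for B, so the second entry breaks the tie in favour of A.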
 
Another usage in social choice theory is the [[Ranked Pairs]] voting method. Although usually defined by a procedure that constructs the order of finish, Ranked Pairs is equivalent to finding which of all possible orders of finish is best according to a minlexmax comparison of the majorities they reverse. In the example above, the Ranked Pairs order of finish is ABCD (which elects A). ABCD affirms the majorities who rank A over B, A over C, B over C and C over D, and reverses the majorities who rank D over A and D over B. The largest majority that ABCD reverses is 65%. The only other ordering that wouldn't reverse a larger majority is BACD (which also reverses 65%). ABCD is a better order of finish than BACD because the lexically relevant set of majorities—the majorities on which ABCD and BACD disagree—is {A over B} and BACD reverses the largest majority in this set.
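
The equivalence described above suggests a brute-force check for this small example: enumerate every order of finish, list the majorities it reverses from largest to smallest, and take the lexicographically smallest list. The sketch below does exactly that. It is not the usual Ranked Pairs procedure, the helper names are hypothetical, and ties between majorities (which do not occur in this example) are not handled.

```python
from itertools import combinations, permutations

ballots = [(18, "ABCD"), (17, "BACD"), (33, "CDAB"), (32, "DBAC")]
candidates = "ABCD"

def support(x, y):
    """Percentage of voters who rank x over y."""
    return sum(w for w, order in ballots if order.index(x) < order.index(y))

# Map each pairwise majority (winner, loser) to its size.
majorities = {}
for x, y in combinations(candidates, 2):
    sx, sy = support(x, y), support(y, x)
    if sx > sy:
        majorities[(x, y)] = sx
    elif sy > sx:
        majorities[(y, x)] = sy

def reversed_majorities(order):
    """Sizes of the majorities that `order` contradicts, largest first, padded with 0."""
    sizes = [size for (winner, loser), size in majorities.items()
             if order.index(loser) < order.index(winner)]
    sizes.sort(reverse=True)
    return sizes + [0] * (len(majorities) - len(sizes))

# The best order of finish minimizes the list of reversed majorities lexicographically.
best_order = "".join(min(permutations(candidates), key=reversed_majorities))
print(best_order)
```

For ABCD the reversed majorities are 65 and 65, while BACD reverses 65, 65 and 51, so ABCD is smaller in the lexicographic comparison.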
 
==Case of multiple products==
 
Suppose
:<math>
( A_1, A_2, \dots, A_n )
</math>
is an ''n''-tuple of sets, with respective total orderings
:<math>
( <_1, <_2, \dots, <_n ).
</math>
 
The dictionary ordering
:<math>
\ \ <^{d}
</math>
of
:<math>
A_1 \times A_2 \times \cdots \times A_n
</math>
is then
:<math>
(a_1, a_2, \dots, a_n) <^d (b_1,b_2, \dots, b_n) \iff
(\exists\ m > 0) \ (\forall\ i < m) (a_i = b_i) \land (a_m <_m b_m)
</math>
 
That is, for some index ''m'', the strict inequality
:<math>
\ \ a_m <_m b_m
</math>
holds and all the preceding terms are equal.
 
Informally,
:<math>
\ \ a_1
</math>
represents the first letter,
:<math>
\ \ a_2
</math>
the second and so on when looking up a word in a dictionary, hence the name.
 
This could be more elegantly stated by recursively defining the ordering of any set
 
:<math>
\ \ C= A_j \times A_{j+1} \times \cdots \times A_k
</math>
 
represented by
:<math>
\ \ <^d (C)
</math>
 
This will satisfy
 
:<math>
a <^d (A_i) a' \iff (a <_i a')
</math>
 
:<math>
(a,b) <^d (A_i \times B) (a',b') \iff
a <^d (A_i) a' \lor ( a=a' \ \land \ b <^d (B) b')
</math>
 
where
<math>
B = A_{i+1} \times A_{i+2} \times \cdots \times A_n.
</math>
 
To put it more simply, compare the first terms. If they are equal, compare the second terms – and so on. The relationship between the first corresponding terms that are not equal determines the relationship between the entire elements.
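
Both formulations can be sketched in Python. In this illustration each component order is passed as a function <code>lt(x, y)</code> meaning "''x'' is strictly less than ''y''"; the names <code>lex_less</code> and <code>lex_less_rec</code> are hypothetical, and the tuples are assumed to have the same length as the list of orders.

```python
import operator

def lex_less(a, b, less_thans):
    """Iterative form: the first unequal component decides."""
    for x, y, lt in zip(a, b, less_thans):
        if x != y:
            return lt(x, y)
    return False  # all components equal, so a is not strictly less than b

def lex_less_rec(a, b, less_thans):
    """Recursive form: compare heads, then recurse on the tails if the heads are equal."""
    head_less = less_thans[0](a[0], b[0])
    if len(a) == 1:
        return head_less
    return head_less or (a[0] == b[0]
                         and lex_less_rec(a[1:], b[1:], less_thans[1:]))

# Three components; the second one is ordered in reverse (larger numbers come first).
orders = [operator.lt, operator.gt, operator.lt]
print(lex_less((1, 5, 0), (1, 3, 9), orders))
print(lex_less_rec((1, 5, 0), (1, 3, 9), orders))
```

In the example the first components are equal, so the second components decide; under the reversed order 5 comes before 3, and the third components are never examined.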
 
==Groups and vector spaces==
If the component sets are [[ordered group]]s then the result is a non-[[Archimedean group]], because e.g. ''n''(0,1) < (1,0) for all ''n''.
 
If the component sets are [[ordered vector space]]s over '''R''' (in particular just '''R'''), then the result is also an ordered vector space.
 
==Ordering of sequences of various lengths==
 
Given a partially ordered set ''A'', the above considerations allow one to define a natural lexicographical partial order <math><^\mathrm{d}</math> over the [[free monoid]] ''A''* formed by the set of all [[finite sequence]]s of elements in ''A'', with sequence [[concatenation]] as the monoid operation, as follows:
 
:<math>u <^\mathrm{d} v</math> if
:* <math>u</math> is a proper [[prefix]] of <math>v</math>, or
:* <math>u=wau'</math> and <math>v=wbv'</math>, where <math>w</math> is the longest common prefix of <math>u</math> and <math>v</math>, <math>a</math> and <math>b</math> are members of ''A'' such that <math>a<b</math>, and <math>u'</math> and <math>v'</math> are members of ''A''*.
 
If < is a total order on ''A'', then so is the lexicographic order <<sup>d</sup> on ''A''*. If ''A'' is a finite and totally ordered alphabet, ''A''* is the set of all [[String (computer science)#Formal theory|words]] over ''A'', and we retrieve the notion of dictionary ordering used in lexicography that gave its name to the lexicographic orderings.
However, in general this is not a [[well-order]], even though it is on the alphabet ''A''; for instance, if ''A'' = {''a'', ''b''}, the [[Formal language|language]] {''a''<sup>''n''</sup>''b'' | ''n'' ≥ 0} has no least element: ... <<sup>d</sup> ''aab'' <<sup>d</sup> ''ab'' <<sup>d</sup> ''b''. A well-order for strings, based on the lexicographical order, is the [[shortlex order]].
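
The comparison of sequences of different lengths, and the descending chain in the counterexample, can be sketched as follows. The function name <code>seq_less</code> is hypothetical; Python's built-in string comparison behaves the same way and is used as a cross-check.

```python
def seq_less(u, v):
    """Strict lexicographic order on finite sequences of possibly different lengths."""
    for x, y in zip(u, v):
        if x != y:
            return x < y          # first differing position decides
    return len(u) < len(v)        # otherwise u must be a proper prefix of v

# Words a^n b for n = 0..4: "b", "ab", "aab", ...
chain = ["a" * n + "b" for n in range(5)]

# Each longer word is strictly smaller than the previous one,
# so the infinite family has no least element.
descending = all(seq_less(chain[n + 1], chain[n]) for n in range(len(chain) - 1))
print(descending)
```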
 
Similarly we can also compare a finite and an infinite string, or two infinite strings.
 
Comparing strings of different lengths can also be modeled as comparing strings of infinite length by right-padding finite strings with a special value that is less than any element of the alphabet.
 
This ordering is the ordering usually used to order [[String (computer science)|character strings]], including in dictionaries and indexes.
 
=== Quasi-lexicographic order ===
The '''quasi-lexicographic order''' on the free monoid ''A''<sup>&lowast;</sup> over an ordered alphabet ''A'' orders strings firstly by length, so that the [[empty string]] comes first, and then within strings of fixed length ''n'', by lexicographic order on ''A''<sup>''n''</sup>.<ref>{{cite book | last=Calude | first=Cristian | authorlink=Cristian S. Calude | title=Information and randomness. An algorithmic perspective | series=EATCS Monographs on Theoretical Computer Science | publisher=[[Springer-Verlag]] | year=1994 | isbn=3-540-57456-5 | zbl=0922.68073 | page=1 }}</ref>
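
In Python, a quasi-lexicographic sort can be sketched by using the pair (length, word) as the sort key, since pairs are themselves compared lexicographically. The sample words over the alphabet {''a'', ''b''} are arbitrary.

```python
words = ["b", "", "ab", "a", "ba", "aab", "aa"]

# Plain lexicographic order: prefixes first, length otherwise irrelevant.
lexicographic = sorted(words)

# Quasi-lexicographic order: shorter words first, ties broken lexicographically.
quasi_lexicographic = sorted(words, key=lambda w: (len(w), w))

print(lexicographic)
print(quasi_lexicographic)
```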
 
==Generalization==
Consider the set of functions ''f'' from a [[well-ordered set]] ''X'' to a [[totally ordered set]] ''Y''. For two such functions ''f'' and ''g'', the order is determined by the values for the smallest ''x'' such that ''f''(''x'') ≠ ''g''(''x'').
 
If ''Y'' is also well-ordered and ''X'' is finite, then the resulting order is a well-order. As already shown above, if ''X'' is infinite this is in general not the case.
 
If ''X'' is infinite and ''Y'' has more than one element, then the resulting set ''Y''<sup>''X''</sup> is not a [[countable set]], see also [[Cardinal number#Cardinal exponentiation|cardinal exponentiation]].
 
Alternatively, consider the functions ''f'' from an inversely well-ordered ''X'' to a well-ordered ''Y'' with minimum 0, restricted to those that are non-zero at only a finite subset of ''X''. The result is well-ordered. Correspondingly we can also consider a well-ordered ''X'' and apply lexicographical order where a higher ''x'' is a more significant position. This corresponds to [[Ordinal arithmetic#Exponentiation|exponentiation of ordinal numbers]] ''Y''<sup>''X''</sup>. If ''X'' and ''Y'' are countable then the resulting set is also countable.
 
==Monomials==
 
In algebra it is traditional to order the [[term (mathematics)|terms]] of a [[polynomial]] by ordering the [[monomial]]s in the [[indeterminate]]s. This is fundamental to obtaining a [[normal form]]. Such matters are typically left implicit in discussion between humans, but must be dealt with exactly in [[computer algebra]]. In practice one has an alphabet of indeterminates ''X'', ''Y'', ... and orders all monomials formed from them by a variant of lexicographical order. For example, if one decides to order the alphabet by
 
:''X'' < ''Y'' < ...
 
and also to look at higher terms first, that means ordering
 
: ... < ''X''<sup>3</sup> < ''X''<sup>2</sup> < ''X''
 
and also
 
: ''X'' < ''Y''<sup>''k''</sup> for all ''k''.
 
There is some flexibility in ordering monomials, and this can be exploited in [[Gröbner basis]] theory.
 
==Decimal fractions==
For [[Decimal#Decimal_fractions|decimal fractions]], compared digit by digit starting at the decimal point, the numerical order and the lexicographic order coincide, provided that [[0.999...|numbers with a recurring decimal 9]] like .399999... are not included in the set of strings representing numbers. With that restriction there is an order-preserving bijection between the strings and the numbers.
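
For finite digit strings the agreement can be checked with a small Python sketch. It assumes each number is written without trailing zeros, so that every number has exactly one string; the sample values and the helper name <code>value</code> are illustrative.

```python
from fractions import Fraction

# Digits after the decimal point, written without trailing zeros.
digit_strings = ["9", "301", "25", "05", "3", "125"]

def value(s):
    """Exact numerical value of the decimal fraction 0.s"""
    return Fraction(int(s), 10 ** len(s))

lexicographic = sorted(digit_strings)          # compares digit by digit
numerical = sorted(digit_strings, key=value)   # compares the represented numbers

print(lexicographic)
print(lexicographic == numerical)
```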
 
== Reverse lexicographic order ==
 
In a common variation of lexicographic order, one compares elements by reading from the right instead of from the left, i.e., the right-most component is the most significant, e.g. applied in a [[rhyming dictionary]].
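
For words, reading from the right amounts to comparing the reversed strings, as in the following Python sketch with an arbitrary word list.

```python
words = ["nation", "cat", "motion", "bat"]

# Sort on the reversed word so that the last letter is the most significant.
rhyming_order = sorted(words, key=lambda w: w[::-1])
print(rhyming_order)
```

Words with the same ending end up next to each other, which is what a rhyming dictionary needs.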
 
In the case of monomials one may sort the exponents downward, with the exponent of the first base variable as primary sort key, e.g.:
: <math> x^2 y z^2 < x y^3 z^2 </math>.
Alternatively, sorting may be done by the sum of the exponents, downward.
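
Both monomial sort keys can be sketched in Python by representing each monomial by its tuple of exponents. The dictionary below and the tie-breaking rule in the second sort (equal total degree is resolved by the exponent tuple) are assumptions made for this illustration.

```python
# Exponents of (x, y, z) for each monomial.
monomials = {
    "x^2 y z^2": (2, 1, 2),
    "x y^3 z^2": (1, 3, 2),
    "y^4": (0, 4, 0),
}

# Exponents sorted downward, the exponent of x being the primary key.
by_exponents = sorted(monomials, key=monomials.get, reverse=True)

# Sum of the exponents sorted downward; ties broken by the exponent tuple (assumption).
by_degree = sorted(monomials,
                   key=lambda m: (sum(monomials[m]), monomials[m]),
                   reverse=True)

print(by_exponents)
print(by_degree)
```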
 
== See also ==
 
* [[Collation]]
* [[Colexicographical order]]
* [[Kleene–Brouwer order]]
* [[Lexicographic preferences]]
* [[Total_order#Orders_on_the_Cartesian_product_of_totally_ordered_sets|Orders on the Cartesian product of totally ordered sets]]
* [[Ordered_vector_space#Examples|Lexicographic order on the '''R'''<sup>''n''</sup>]]
* [[Lexicographic order topology on the unit square]]
* [[Long line (topology)]]
* [[Product order]]
* [[Lyndon word]]
* [[Lexicographically minimal string rotation]]
 
==References==
{{reflist}}
 
[[Category:Order theory]]
[[Category:Lexicography]]