arXiv:2411.04718v1 [cs.DS] 7 Nov 2024

Abstract
We consider the problem of counting the copies of a length-k pattern σ in a sequence f : [n] → R,
where a copy is a subset of indices i1 < . . . < ik ∈ [n] such that f (ij ) < f (iℓ ) if and only if σ(j) < σ(ℓ).
This problem is motivated by a range of connections and applications in ranking, nonparametric statistics,
combinatorics, and fine-grained complexity, especially when k is a small fixed constant.
Recent advances have significantly improved our understanding of counting and detecting patterns.
Guillemot and Marx [2014] demonstrated that the detection variant is solvable in O(n) time for any fixed
k. Their proof has laid the foundations for the discovery of the twin-width, a concept that has notably
advanced parameterized complexity in recent years. Counting, in contrast, is harder: it has a conditional
lower bound of n^{Ω(k/ log k)} [Berendsohn, Kozma, and Marx 2019] and is expected to be polynomially
harder than detection as early as k = 4, given its equivalence to counting 4-cycles in graphs [Dudek and
Gawrychowski, 2020].
In this work, we design a deterministic near-linear time (1 + ε)-approximation algorithm for counting
σ-copies in f for all k ≤ 5. Combined with the conditional lower bound for k = 4, this establishes the
first known separation between approximate and exact algorithms for pattern counting. Interestingly,
our algorithm leverages the Birgé decomposition – a sublinear tool for monotone distributions widely
used in distribution testing – which, to our knowledge, has not been applied in a pattern counting context
before.
∗ Department of Computer Science, Technion, Haifa, Israel. Supported by a Taub Family Foundation “Leaders in Science &
Technology” fellowship. Work conducted in part while the author was at MIT and later at the Simons Institute for the Theory
of Computing. Email: [email protected]
† UC Davis, CA, USA. Supported by the Google Research Scholar and NSF Faculty Early Career Development Program
No. 2340048. Part of this work was conducted while the author was visiting the Simons Institute for the Theory of Computing.
Email: [email protected].
‡ Massachusetts Institute of Technology, Cambridge, MA, USA. Email: [email protected]
Contents
1 Introduction
1.1 Our results
1.2 Our techniques
1.2.1 Leveraging the Birgé decomposition for monotonicity-based counting (Section 3)
1.2.2 Imposing structure through separators for 4-patterns (Section 4)
1.2.3 Global separators for 5-patterns (Section 5)
1.2.4 A Primitive for Counting 12 Copies within Axis-Parallel Rectangles (Section 5.1)
1.3 Open problems
2 Preliminaries
2.1 Algorithmic primitives
2.1.1 Segment trees
2.1.2 Birgé decomposition: Fast approximation of monotone sums
A Segment trees
1 Introduction
Detecting and counting structural patterns in a data sequence is a common algorithmic challenge in various
theoretical and applied domains. Some of the numerous application domains include ranking and recommen-
dation [DKNS01], time series analysis [BP02], and computational biology [FDRM09], among many others.
On the mathematical/theoretical side, problems involving sequential pattern analysis naturally arise, e.g.,
in algebraic geometry [AB16], combinatorics [CDN23, Grü23], and nonparametric statistics [EZL21].
Formally, we are interested here in finding order patterns or permutation patterns, defined as follows.
Given a real-valued sequence f : [n] → R and a permutation pattern σ : [k] → [k], a copy of the pattern σ in
the sequence f is any subset of k indices i1 < i2 < . . . < ik so that for j, ℓ ∈ [k], f (ij ) < f (iℓ ) if and only if
σ(j) < σ(ℓ); see Figure 1.
Figure 1: A configuration of n points in two dimensions (with no two points sharing the same x coordinate),
represented as a function f : [n] → R. The four full points form a copy of the permutation pattern 1432.
In the permutation pattern matching (PPM) problem,1 the task is to determine whether f contains at
least one copy of the pattern σ. In the counting variant, the goal is to return the exact or approximate
number of σ-copies in f . Recent years have seen several breakthroughs in both detection and counting,
revealing important implications in parameterized and fine-grained complexity.
Of most importance is the case where k is a small constant, which has a large number of diverse appli-
cations and interesting connections:
• Counting inversions, which are 21-patterns, that is, k = 2, is of fundamental importance for ranking
applications [DKNS01]. It has thus attracted significant attention from the algorithmic community
for the last several decades, for both exact counting [CP10, Die89, FS89] and approximate counting
[CP10, AP98].
• Counting 4-patterns2 is equivalent, by a bidirectional reduction, to counting 4-cycles in sparse graphs.
The latter is a fundamental problem in algorithmic graph theory (e.g., [AYZ97, DKS17]) and fine-
grained complexity (e.g., [WWWY15, ABKZ22, ABF23, JX23]). This equivalence was shown by
Dudek and Gawrychowski [DG20].
• Pattern counting for fixed k (especially k ≤ 5) has deep and intricate connections to (bivariate)
independence testing, a fundamental question in nonparametric statistics that asks the following. Given
n pairs of samples (x1 , y1 ), . . . , (xn , yn ) from two real continuous random variables X and Y , should we
deduce that X and Y are independent? This question has seen a long line of work in nonparametric
statistics (e.g., [EZ20, BD14, Yan70, Cha21, BKR61]). A line of work started by Hoeffding in
the 1940s [Hoe48] and still very active to this day establishes distribution-free methods to test
independence by (i) ordering the sample pairs according to the values of the xi ’s, effectively treating
the yi ’s as a length-n sequence; and (ii) deciding whether X and Y are independent based on the k-
profile of yi ’s, for k ≤ 5. This is a special case of the much broader notion of U -statistics [Lee90, KB94].
See [EZ20, Grü23] for more details on this fascinating connection.
• A family of length-n permutations is considered quasirandom if, roughly speaking, the number of
occurrences of every pattern in the family (of any length) is asymptotically similar to that of a ran-
dom permutation. Quasirandomness turns out to be quite closely related to independence testing,
1 We shall interchangeably use the terms “pattern matching” and “pattern detection” to refer to this problem.
2 We henceforth use the abbreviation “k-pattern” to refer to a permutation pattern of length k.
discussed above, and it is known that the counts of patterns of length up to four suffice to determine
quasirandomness, see, e.g., [CDN23, Grü23].
• Permutation pattern matching allows one to deduce whether an input f is free from some pattern
σ, and consequently run much faster algorithms tailored to σ-free instances. Indeed, many classical
optimization tasks, such as binary search trees, k-server, and Euclidean TSP [BKO24], become much
faster on σ-free inputs. For example, a recent fascinating result by Opler [Opl24] shows that sorting
can be done in linear time in pattern-avoiding sequences. Pattern matching itself sometimes also
becomes faster in classes of σ-free permutations [JK17, JOP21, BBL98].
Consequently, there has been a long line of computational work on pattern matching and counting, e.g.,
[BL12, BD14, JK17, BKM21, EZ20, JOP21, Cha21, GR22]. Here, we focus on the most relevant results in
the constant k case. Notably, the version of the problem where k is large (linear in n) is NP-hard [BBL98].
Both matching and counting admit a trivial algorithm with running time O(k · n^k): the idea is to enumerate
over all k-tuples of indices in f, and check whether each such tuple induces a copy of the pattern. But can
these algorithmic tasks be solved in time substantially smaller than n^k?
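To make this baseline concrete, here is a minimal Python sketch (our own illustrative code, not from the paper) of the trivial enumeration: it checks every k-tuple of indices against the pattern and is practical only for very small n; it assumes distinct values, as in the permutation setting used throughout the paper.

```python
from itertools import combinations

def pattern_of(values):
    """Return the permutation pattern induced by `values`, as a tuple of ranks 1..k.
    Assumes the values are distinct."""
    order = sorted(range(len(values)), key=lambda t: values[t])
    ranks = [0] * len(values)
    for rank, pos in enumerate(order, start=1):
        ranks[pos] = rank
    return tuple(ranks)

def count_copies_bruteforce(f, sigma):
    """Count copies of the pattern sigma (e.g. (1, 4, 3, 2)) in the sequence f,
    in O(k * n^k) time: enumerate all index tuples i_1 < ... < i_k and compare
    the induced pattern with sigma."""
    k = len(sigma)
    return sum(1 for idx in combinations(range(len(f)), k)
               if pattern_of([f[i] for i in idx]) == sigma)

# Example: count 21-copies (inversions) in a short permutation.
print(count_copies_bruteforce([3, 1, 4, 2, 5], (2, 1)))  # -> 3
```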
Pattern matching: a linear-time algorithm, and the twin-width connection. In the matching
case, the answer is resoundingly positive. The seminal work of Guillemot and Marx [GM14] shows that
PPM is a fixed-parameter tractable (FPT) problem that takes O(n) time for fixed k.3 Their running time
is of the form 2^{O(k^2 log k)} · n; the bound was slightly improved by Fox to 2^{O(k^2)} · n [Fox13].
The technical argument of [GM14] relies on two main ingredients: the first is the celebrated result of
Marcus and Tardos [MT04] in their proof of the Stanley-Wilf conjecture [FH92, Kla00], while the second is a
novel width notion for permutations suggested in their work. The latter subsequently led to the development
of the very wide and useful notion of twin-width, which has revolutionized parametrized complexity in recent
years. Indeed, the work of Bonnet, Kim, Thomassé, and Watrigant [BKTW21], which originally defined
twin-width, begins with the following statement: “Inspired by a width invariant defined on permutations by
Guillemot and Marx [GM14], we introduce the notion of twin-width on graphs and on matrices.”
Pattern counting: algorithms and hardness. Exact counting, meanwhile, is unlikely to admit very
efficient algorithms. A series of works from the last two decades has gradually improved the n^k upper
bound, obtaining bounds of the form n^{(c+o(1))k} for constant c < 1 [AAAH01, AR08]. The current state
of the art, proved by Berendsohn, Kozma, and Marx [BKM21], is of the form n^{k/4+o(k)}. The same work
shows, however, that n^{o(k/ log k)}-time algorithms for exact counting cannot exist unless the exponential-time
hypothesis (ETH) is false. The above results treat k as a variable; we next focus on the case where k is very
small, given the myriad of applications discussed before.
In the case k = 2, it is easy to obtain an exact counting algorithm in time O(n log n) (in the Word RAM
model), via a variant of merge sort. A line of work [Die89, FS89, AP98, CP10] sought to obtain improved
algorithms for both exact and approximate counting4 (to within a 1 + ε multiplicative factor). The best
known exact and approximate upper bounds for k = 2 are O(n√log n) and O(n), respectively, both proved
by Chan and Pătraşcu [CP10].
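As an illustration of the k = 2 baseline, the classical merge-sort variant can be sketched as follows; this is the textbook O(n log n) exact inversion counter, not the O(n√log n) Word-RAM algorithm of [CP10].

```python
def count_inversions(a):
    """Exactly count pairs i < j with a[i] > a[j] in O(n log n) time via merge sort."""
    def sort_count(seq):
        if len(seq) <= 1:
            return seq, 0
        mid = len(seq) // 2
        left, inv_left = sort_count(seq[:mid])
        right, inv_right = sort_count(seq[mid:])
        merged, inv = [], inv_left + inv_right
        i = j = 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                merged.append(left[i]); i += 1
            else:
                inv += len(left) - i      # every remaining left element exceeds right[j]
                merged.append(right[j]); j += 1
        merged.extend(left[i:]); merged.extend(right[j:])
        return merged, inv
    return sort_count(list(a))[1]

print(count_inversions([3, 1, 4, 2, 5]))  # -> 3
```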
The cases of k = 3 and k = 4 have been the subject of multiple recent works. Even-Zohar and Leng
[EZL21] developed an object called corner tree to count a family of patterns (that slightly differ from
permutation patterns) in near-linear time. Using linear combinations of corner tree formulas, they obtained
near-linear time algorithms for all patterns of length 3 and some (8 out of 24) length-4 patterns. For the
remaining ones of length 4, the same work obtains an O(n^{3/2})-time algorithm using different techniques. This
dichotomy between “easy” and “hard” 4-patterns raises a natural question: is the dichotomy
an artifact of the specific technique, or is there an inherent computational barrier?
Dudek and Gawrychowski [DG20] proved that the latter is true: exact counting of any “hard” 4-pattern
is equivalent (via bidirectional reductions) to exact counting of 4-cycles in graphs, a central and very well
3 Unless mentioned otherwise, the computational model is Word RAM, that allows querying a single function value or
studied problem in algorithmic graph theory. The concrete equivalence stated in their paper (see Theorem
1 there) is that an Õ(m^γ)-time algorithm for counting 4-cycles in m-edge graphs implies an Õ(n^γ)-time
algorithm for counting “hard” 4-patterns, and vice versa. While this has led to a slightly improved O(n^{1.48})
upper bound based on best known results for counting 4-cycles in sparse graphs [WWWY15], the more
interesting direction to us is the lower bound side. A line of recent works obtains conditional lower bounds
on 4-cycle counting, that apply already for the easier task of 4-cycle detection [ABKZ22, ABF23, JX23].
These works imply that, conditioned on the Strong 3-SUM conjecture, detecting whether a (sufficiently
sparse) graph with m edges contains a 4-cycle requires m^{1+Ω(1)} time (see, e.g., the discussion after Theorem
1.14 in [JX23]), which translates to an n^{1+Ω(1)} lower bound for exact counting of 4-patterns, via [DG20].
1.2 Our techniques
Our approach to approximate pattern counting is based on a novel application of a known tool in distribution
testing, and on several new techniques. Each of these techniques contributes to efficient approximate counting
for small fixed patterns. Here, we outline three main ideas central to our work: (i) the Birgé technique for
exploiting structural monotonicity; (ii) using separators to impose additional structure on pattern instances;
and (iii) a specialized data structure for approximating the counts of 12 copies within axis-parallel rectangles.5
Figure 2: The illustration corresponds to permutation π = 136548279, depicted in a plane at points (i, πi). Highlighted are the candidates for “4” after fixing 6 as the “3” in a 1324-pattern.
1324 patterns in a monotone way. For example, after fixing “3” to a specific position in the permutation,
we can identify all positions of “4” that can extend this configuration into valid 1324 copies. Within this
subset, the positions of “4” exhibit a specific ordering: if “4” appears at a given position in the sequence,
any more-to-the-right occurrence of “4” will continue to yield valid 1324 copies! Similarly, we fix “2” and
then count the relevant candidates for “1”. In Section 3, we show that fixing “2” also exhibits a certain
monotonicity.
We use the Birgé decomposition to take advantage of this structure. The decomposition allows us to
break down each subset Ci into manageable, monotone classes and then efficiently approximate the count of
each class in polylogarithmic time. By structuring the count around this monotonicity, we can approximately
compute each |Ci | without directly enumerating all possibilities, which would be computationally expensive.
So, by fixing values like “3”, then “4”, and then “2”, and using the Birgé decomposition to handle
the emerging monotonic structures, we reduce the complexity of counting 1324 patterns to a series of fast
approximations, leading to an O(n · poly(log n, ε^{-1})) running time.
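To make this counting structure concrete, the following sketch counts 1324-copies exactly by fixing the “3”, then the “4”, then the “2”, and counting the candidates for “1” with a prefix count. It is our own illustrative code rather than the paper's near-linear algorithm: it omits the Birgé and segment-tree machinery and therefore runs in roughly cubic time.

```python
def count_1324_exact(pi):
    """Exactly count 1324-copies in pi: indices i1 < i2 < i3 < i4 with
    pi[i1] < pi[i3] < pi[i2] < pi[i4].  Here the "3" sits at i2, the "4" at i4,
    the "2" at i3, and the "1" at i1.  Assumes pi holds distinct integers in
    {0, ..., n}.  Time: O(n^3) after O(n^2) preprocessing."""
    n = len(pi)
    # prefix_smaller[i][v] = #{l < i : pi[l] < v}, used to count candidates for "1".
    prefix_smaller = [[0] * (n + 2) for _ in range(n + 1)]
    for i in range(1, n + 1):
        prev, row = prefix_smaller[i - 1], prefix_smaller[i]
        for v in range(n + 2):
            row[v] = prev[v] + (1 if pi[i - 1] < v else 0)
    total = 0
    for i2 in range(n):                      # fix the "3"
        for i4 in range(i2 + 2, n):          # fix the "4" (room for the "2" in between)
            if pi[i4] <= pi[i2]:
                continue
            for i3 in range(i2 + 1, i4):     # fix the "2"
                if pi[i3] < pi[i2]:
                    # candidates for "1": positions left of i2 with value below pi[i3]
                    total += prefix_smaller[i2][pi[i3]]
    return total

print(count_1324_exact([1, 3, 2, 4]))  # the single copy 1-3-2-4 -> 1
```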
5 Throughout our work, we assume the input is a permutation. Nevertheless, our proofs also handle inputs/functions that
contain points with the same y-coordinate, i.e., the proofs tolerate f(i) = f(j) for i ≠ j. Also, without loss of generality, for
the problem of counting patterns, it can be assumed that f (i) ∈ {0, 1, . . . , n}.
1.2.2 Imposing structure through separators for 4-patterns (Section 4)
While the Birgé decomposition effectively handles some patterns, others (such as 2413) do not exhibit the
same straightforward monotonic structure. For these patterns, we introduce separators to impose additional
structural constraints.
Consider the 4-pattern 2413. Unlike 1324, this pattern does not naturally exhibit a straightforward
monotonic structure. If we fix “4” to a particular position, we would ideally like the positions of other
elements – “2”, “1”, and “3” – to show some consistent ordering so that we can apply an efficient counting
method. However, without further structuring, the placements of “1” and “3” relative to “4” do not seem
to reveal any particular order.
Figure 3: An illustration of the idea of using separators to split the candidates for “1” and “3” into disjoint but
neighboring regions based on their position.
To handle this, we introduce a separator to divide the possible positions of elements in 2413 based on
their relative positions to “4”. For instance, after fixing “4”, we introduce a position-based separator s that
splits the plane into two regions. We then require that “1” appears to the left of s while “3” appears to the
right of s. This allows us to approximate the count of 2413 copies within each configuration independently.
We illustrate such a separator in Figure 3. With this separator in place, the counts of 2413 copies become
monotone again, enabling us to apply the Birgé decomposition to each subset created by the separator. The
complete analysis is presented in Section 4.
Figure 4: This sketch depicts the notion of vertical and horizontal global separators. In this example, the vertical
dashed (blue) line is a vertical separator, splitting the range [a, b] into two equal-sized halves. The horizontal
dashed (red) line is a horizontal separator. The example also shows a (24135) copy. This copy is counted only if
(i) the “2” is to the left and the “5” is to the right of the vertical separator, and (ii) the “1” is below and the
“5” is above the horizontal separator.
In addition to vertical separators, we introduce horizontal separators that further partition each v based
on the y-axis. This second layer of separation divides the region into four distinct quadrants. We refer
to Figure 8 for an illustration. In addition, we consider all valid configurations of 24135 copies relative to
these quadrants. For instance, we can enforce that specific elements (e.g., “2” and “5”) fall on opposite
sides of the vertical separator and that others (e.g., “1” and “5”) fall on opposite sides of the horizontal
separator. This structure ensures that each copy of the pattern is counted exactly once within a unique
configuration. Crucially, it turns out that this structure also induces monotonicity and allows for using the
Birgé decomposition for efficient approximate counting.
1.2.4 A Primitive for Counting 12 Copies within Axis-Parallel Rectangles (Section 5.1)
Our final technique introduces a data structure for counting simple 12 patterns (increasing pairs) within
arbitrary axis-aligned rectangles. This primitive allows us to query the approximate number of 12 copies
within any subregion of the input permutation. We employ this data structure to count 5-patterns.
To develop this 12-copy counting data structure, we employ a two-dimensional segment tree described
in the previous subsection. With this tree, we pre-process the points in a bottom-up manner in O(n ·
poly(log n, ε−1 )) time. Section 5.1 details the implementation of this bottom-up pre-processing. This pre-
processing computes an approximate number of 12 copies within each vertex of the segment tree. These
pre-computed values are later used to answer queries for approximating the number of 12 copies within
arbitrary rectangles, each answered in polylogarithmic time.
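For reference, the quantity this primitive approximates can be computed naively as follows (a quadratic-time sketch with hypothetical names; the actual data structure answers such queries in polylogarithmic time after near-linear preprocessing).

```python
def count_12_in_rectangle(points, x_lo, x_hi, y_lo, y_hi):
    """Exactly count 12 copies (pairs of points, one strictly left of and below
    the other) among the points inside the axis-parallel rectangle
    [x_lo, x_hi] x [y_lo, y_hi].  Brute force, O(m^2) in the number of points m."""
    inside = [(x, y) for (x, y) in points
              if x_lo <= x <= x_hi and y_lo <= y <= y_hi]
    return sum(1 for i, (x1, y1) in enumerate(inside)
                 for (x2, y2) in inside[i + 1:]
                 if (x1 < x2 and y1 < y2) or (x2 < x1 and y2 < y1))

# Example on the permutation 136548279, as points (i, pi_i):
pts = [(i + 1, v) for i, v in enumerate([1, 3, 6, 5, 4, 8, 2, 7, 9])]
print(count_12_in_rectangle(pts, 1, 9, 1, 9))  # all increasing pairs -> 27
```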
counting and detection, and it is a priori unclear where its complexity sits between linear in n (for detection)
and nearly worst-possible (for exact counting).
Question 1.3 (Complexity of approximate counting). What is the time complexity of approximating the
number of σ-patterns in an input sequence f : [n] → R to within a (1 + ε)-multiplicative error, as a function
of n and k = |σ|?
Establishing tight upper and lower bounds for Question 1.3 appears to be challenging. Even for exact
pattern counting, a more extensively studied problem, there remains a gap between the best known upper
bound of n^{k/4+o(k)} and the conditional lower bound of n^{Ω(k/ log k)}, both attained by Berendsohn, Kozma
and Marx [BKM21]. Nevertheless, given the separation we establish for k = 4 and k = 5 (along with the
new techniques which are specially suited for approximate computation) it is tempting to conjecture that
the complexity of approximate counting in the general case, as a function of n and k, is fundamentally lower
than that of exact counting. We make the following conjecture.
Conjecture 1.4. The time complexity of approximate counting σ-copies in a length-n sequence, as a function
of n and k = |σ|, is asymptotically smaller than that of exact counting for the same parameters.
Proving any bound of the form n^{o(k/ log k)} would affirm this conjecture. But even improving upon the state
of the art for exact counting would be interesting. The current best known approach of [BKM21] formulates
the pattern matching instance as a constraint satisfaction problem (CSP) with binary constraints. The
complexity of solving this CSP is O(n^{t+1}), where t is the treewidth of the incidence graph of the pattern π
(see also the work of Ahal and Rabinovich [AR08] for an earlier investigation of the role of treewidth in this
context). The basic constraint graph has treewidth bounded by k/3 + o(k); Berendsohn et al. combine the
tree-width based approach with a gridding technique based on ideas of Cygan, Kowalik, and Socala [CKS19]
to reduce the exponent to k/4 + o(k).
As we see here, algorithmic results for both detection and exact counting make use of central width
notions from the parametrized complexity literature: the former gave rise to twin-width [GM14, BKTW21]
and the latter makes heavy use of tree-width [AR08, BKM21]. It would be very intriguing to explore what
role such width notions may play in the approximate version of pattern counting. The fact that approximate
counting (in the small k case) admits techniques that go beyond the exact case may suggest that either a
complexity notion other than tree-width is at play here, or we can use the new techniques to bound the
tree-width of an easier subproblem (with more of the values constrained due to the use of, say, substructure
monotonicity and Birgé approximation).
From the lower bound side, essentially no nontrivial (superlinear) results are known for the Word RAM
model, and proving any ω(n) lower bound that applies to the approximate counting of some fixed-length
patterns would be interesting. We further conjecture that for large enough (constant) k, there should be a
strongly superlinear bound.
Conjecture 1.5. There exists a pattern σ of constant length for which approximate counting of σ in length-n
sequences requires n^{1+Ω(1)} time.
For k = 3, 4, 5, the existing algorithms for, say, 2-approximate counting (and exact counting, for k = 3)
have time complexity n log^{O(1)} n. This raises the question of whether the polylogarithmic dependence is
necessary (for k = 2 it is not necessary [CP10]). We conjecture that the answer is positive already for k = 4.
Finally, the use of Birgé decomposition in this paper seems to be novel in the context of pattern counting
and, perhaps more generally, in combinatorial contexts beyond the scope of distribution testing. This
decomposition is very useful in our setting as many sequences of quantities turn out to be monotone. It would
be interesting to find other counting problems in low-dimensional geometric settings where this technique,
of finding and exploiting monotone subsequences, may be useful.
2 Preliminaries
2.1 Algorithmic primitives
2.1.1 Segment trees
We use a natural and standard representation of permutations in which a permutation π is represented by
the set of points {(i, πi) : i ∈ [n]} in the plane. On this set of points, our algorithms perform simple counting
queries.
Lemma 2.1 (Segment tree data structure). Let π be a permutation over [n]. Define

S_{i,j}^{a,b} := {x ∈ [n] : i ≤ x ≤ j, a ≤ π(x) ≤ b}

and N_{i,j}^{a,b} := |S_{i,j}^{a,b}|. There exists a data structure that, given π, initializes in time O(n log^2 n) using O(n log n) space, and supports the following operations in time O(log^2 n):

1. Value and location counts: given indices i ≤ j ∈ [n] and values a ≤ b ∈ [n], return N_{i,j}^{a,b}.

2. Query access to locations in segment: Given i, j, a, b as above, and 1 ≤ ℓ ≤ N_{i,j}^{a,b}, return the index of the ℓ-th leftmost element within the set S_{i,j}^{a,b}.

3. Query access to values in segment: Given i, j, a, b as above, and 1 ≤ ℓ ≤ N_{i,j}^{a,b}, return the ℓ-th largest value within the set {π(x) : x ∈ S_{i,j}^{a,b}}.
Lemma 2.1 can be obtained using standard techniques in the data structure literature. For completeness,
we outline those techniques in Appendix A.
As an immediate application, since the sizes of the intervals I_j are known in advance, approximating the
sum ∑_{i=1}^{n} x_i to within a 1 ± ε multiplicative factor can be done in O(log n/ε) time (assuming that accessing
the value of each x_i requires O(1) time). This is summarized in the following lemma.
Lemma 2.4 (Fast approximation of monotone sums). Let 0 < ε < 1 and n ∈ N be known parameters, and
suppose we are given query access to a monotone sequence x_1 ≥ x_2 ≥ ... ≥ x_n ≥ 0 of real numbers. Then
there exists a deterministic algorithm which returns a value y ∈ (1 ± ε) ∑_{i=1}^{n} x_i, with query complexity and
running time O(ε^{-1} log n).
Moreover, if the query access provides a multiplicative 1 ± γ approximation, then this algorithm returns
a value y ∈ (1 ± γ)(1 ± ε) ∑_{i=1}^{n} x_i; the algorithm is oblivious to the value of γ.
We note that this lemma and its proof are slightly different from the statement usually named after
Birgé. The traditional version concerns distributions and is often used in distribution testing settings [Can20,
DDS+ 13, DDS14]. It assumes that we can approximate the probability density of sub-intervals of elements,
which cannot directly be done in our setting. In contrast, our version has a different, query-based access to
the input. We also note that the dependence on the proximity parameter ε for our application is inversely
linear, whereas for tasks such as learning monotone distributions, the optimal dependence is known to be
polynomial (and superlinear) in 1/ε.
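The following sketch illustrates this query-based version: the range is split into intervals of geometrically growing lengths and a single element is queried per interval. Querying the left endpoint of each interval (its largest element, for a non-increasing sequence) yields an estimate that never falls below the true sum and exceeds it by at most a (1 + O(ε)) factor, in the spirit of Lemma 2.4; the precise interval lengths of Definition 2.2 may differ slightly, so this is an assumption-laden illustration rather than the paper's exact procedure.

```python
def approx_monotone_sum(query, n, eps):
    """Approximate sum_{i=0}^{n-1} x_i of a non-increasing, non-negative sequence
    accessed via query(i) -> x_i, using O(log(n)/eps) queries.

    The range [0, n) is split into intervals whose lengths grow roughly by a
    (1 + eps) factor.  Each interval is charged query(left endpoint) * length;
    since the sequence is non-increasing, the left endpoint holds the largest
    value in its interval, so the estimate never underestimates the true sum and
    (globally) overestimates it by at most a (1 + O(eps)) factor."""
    total, start, k = 0.0, 0, 0
    while start < n:
        length = min(max(1, int((1 + eps) ** k)), n - start)
        total += query(start) * length
        start += length
        k += 1
    return total

# Example: x_i = 1/(i + 1); for n = 10**6 the true sum is about 14.39.
print(approx_monotone_sum(lambda i: 1.0 / (i + 1), 10**6, 0.1))
```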
Proof of Lemma 2.3. Write J_ε(n) = {I_1, . . . , I_ℓ} using the same notation as in Definition 2.2. For each
interval I_j, let m_j = min(I_j) and M_j = max(I_j). By the monotonicity of the sequence (x_i)_{i=1}^{n}, we know that

∑_{j=1}^{ℓ} x_{m_j} · |I_j| ≥ ∑_{j=1}^{ℓ} x_{i_j} · |I_j| ≥ ∑_{j=1}^{ℓ} x_{M_j} · |I_j|

for any choice of indices i_j in the statement of the lemma. Indeed, this is true since x_{m_j} ≥ x_{i_j} ≥ x_{M_j} due
to the monotonicity. Next, note that we also have

∑_{j=1}^{ℓ} x_{m_j} · |I_j| ≥ ∑_{i=1}^{n} x_i ≥ ∑_{j=1}^{ℓ} x_{M_j} · |I_j|,

since for each j, x_{m_j}|I_j| ≥ ∑_{i∈I_j} x_i ≥ x_{M_j}|I_j|, again from the monotonicity. Thus, to complete the proof of
the lemma, it remains to prove the following inequality:

∑_{j=1}^{ℓ} x_{m_j}|I_j| ≤ (1 + O(ε)) · ∑_{j=1}^{ℓ} x_{M_j}|I_j|.    (1)

Now, the summand corresponding to t = 1 in (2) is equal to zero, since m_j = M_j when |I_j| = 1. For
2 ≤ t ≤ 1/ε, we have

∑_{j∈A_t} (x_{m_j} − x_{M_j}) ≤ ∑_{j∈A_t} (x_{M_{j−1}} − x_{M_j}) = x_{i_{t−1}} − x_{i_t},

and so the first sum in the right-hand side of (2) is bounded by

x_{i_1} − (1/ε) · x_{i_{1/ε}} + ∑_{t=1}^{1/ε} x_{i_t}.
We next claim that

x_{i_1} + ∑_{t=1}^{1/ε} x_{i_t} ≤ O(ε) · ∑_{i=1}^{i_{1/ε}} x_i.

Indeed, this follows by observing that for each 1 ≤ t ≤ 1/ε, we have i_{t+1} − i_t = Θ(1/ε), and the fact that
x_i ≥ x_{i_t} for all i ∈ ⋃_{I∈A_t} I. Now, for any j where I_j ∈ A′, we have that |I_{j+1}| ≤ (1 + O(ε))|I_j|. Thus, by a
telescopic sum argument, we conclude that

∑_{j∈A′} x_{m_j}|I_j| ≤ ∑_{j∈A′} x_{M_{j−1}}|I_j| ≤ x_{i_{1/ε}} · |I_{j^{(1/ε)}+1}| + (1 + O(ε)) · ∑_{j=j_0}^{ℓ} x_{M_j}|I_j|    (3)

where we recall that |I_{j^{(1/ε)}+1}| = 1/ε + 1. Combining all of the above inequalities, we have that

∑_{j=1}^{ℓ} x_{m_j}|I_j| − ∑_{j=1}^{ℓ} x_{M_j}|I_j| ≤ O(ε) · ∑_{j=1}^{ℓ} x_{M_j}|I_j|,
Our approach approximates each |Ci | independently. The main technical contribution of our work is showing
that Ci can be further partitioned into classes that exhibit certain monotonicity in their size. Our approach
employs the Birgé decomposition (Lemma 2.3) to leverage that property and approximate |Ci| in only
poly log n time. We now describe the details of this idea.
Figure 5: The illustration corresponds to permutation π = 136548279, depicted in a plane at points (i, πi). Highlighted are the candidates for “2” after fixing 6 as the “3” and 9 as the “4” in a 1324-pattern.
3.2.4 Algorithm
As a reminder, Ci,j,k is the set of all 1324 copies such that “3” equals πi , “4” equals πj , and “2” equals πk .
|Ci,j,k | is computed by counting the number of points (ℓ, πℓ ) such that 1 ≤ ℓ ≤ i − 1 and 1 ≤ πℓ < πk − 1.
This can be done in poly log n time using sparse segment trees, as provided by Lemma 2.1. This now enables
us to provide the pseudo-code of our approach (Algorithm 1).
We are now ready to show the following.
Algorithm 1 Approximate-1324-Copies
Input: A permutation π; an approximation parameter ε > 0
Output: a 1 + ε approximation of the number of 1324 copies in π
Theorem 3.4. Given a permutation π and an approximation parameter ε ∈ (0, 1), Algorithm 1 computes a
1 ± 3ε approximation of the number of 1324 copies in π in time O(n · poly(log(n)/ε)).
Proof. We analyze separately the running time and the approximation guarantee.
Running time. There are n options to choose i. By Lemma 2.4, |J ′ |, |K ′ | ∈ O(log(n)/ε). Note that the
sets J and K need not be constructed explicitly. It suffices to, for a given t, be able to access the t-th element
of those sets, which can be done in O(log^2 n) time using S. Finally, Line 9 of Algorithm 1 can be executed
in O(log^2 n) time; see Lemma 2.1.
Therefore, the overall running time is O(n · poly(log(n)/ε)).
Approximation guarantee. Let ci,j,k , ci,j and ci be as defined in Algorithm 1. Observe that ci,j,k =
|Ci,j,k |. By the guarantee of the algorithm in Lemma 2.4, we have ci,j ∈ (1 ± ε)|Ci,j |.
Since ci,j are used to obtain an approximation ci of |Ci |, by Lemma 2.4 we have that ci ∈ (1±ε)(1±ε)|Ci | ∈
(1 ± 3ε)|Ci |, for ε ∈ (0, 1).
with respect to the position of “4”; after fixing “3” and “4”, the counts are monotone with respect to the
value of “2”. Unfortunately, copies of 2413 do not seem to exhibit such a property. To alleviate that, we
observe that there is an additional way of partitioning the copies of 2413.
To illustrate this partitioning approach, assume that we fix “4”. Then, we would like to exhibit the
monotonicity of the copy counts with respect to the value or position of at least one among “2”, “1”, and
“3”. However, this is not the case. Intuitively, the challenge here is that the tools we developed so far do
not enable us to approximate the number of copies of 12 in a given permutation in poly(log n, 1/ε) time. To
see how it affects counting 2413 copies, for instance, after fixing a “4”, no special structure is imposed on
the candidates of “1” and “3”! Indeed, even though both “1” and “3” have to be to the right and below
the fixed “4”, our algorithm still needs to (approximately) count the number of monotone pairs in a given
subarray.
What if we are concerned only with the number of 2413 copies in which the position of “1” is less than
s, while the position of “3” is greater than s? This situation is illustrated in Figure 3, and s should be
thought of as a “separator”. After imposing this additional structure between “1” and “3”, the counts become
monotone with respect to the value of “3”. Hence, we can again apply the Birgé theorem for approximating
the counts.
It remains to show that there exists a small number of separators that enable counting all 2413 copies.
We dive into those details in the rest of this section, describing how to partition “3” and “4” into certain
buckets that allow for the described 2413-copy partitioning. Ultimately, this section leads to the following
result:
Theorem 4.1 (Approximating 2413 copies). There exists a deterministic algorithm for approximating the
number of 2413 copies in a permutation of length n to within a multiplicative factor of 1 + ε, with running
time of n · poly(log n, 1/ε).
Organization of this section. We begin, in Section 4.1, by stating several definitions that are instrumen-
tal in describing our partitioning of 2413. Section 4.2 outlines our proof of Theorem 4.1, while Sections 4.3
and 4.4 prove the main technical claims we need in the proof of Theorem 4.1.
4.1 Preliminaries
For convenience, we let [n] := {0, 1, . . . , n − 1}. We begin by defining the notion of j-buckets and type-j
copies, which are instrumental in defining the kind of separator we use and illustrate in Figure 3. Recall
that a copy of 2413 in a permutation π : [n] → [n] is any quadruple of indices i1 < i2 < i3 < i4 such that
π(i3 ) < π(i1 ) < π(i4 ) < π(i2 ).
Definition 4.2 (Type of copy; j-buckets). For each index i ∈ [n] consider the standard binary representation
of i using ⌈log n⌉ bits, and define the j-least significant bit (or j-LSB in short) as the term corresponding to
2^j in the binary representation. We say that a 2413 copy (i1, i2, i3, i4) in π is type-j if i2, i.e., the index of
the “4”, and i4 (the index of the “3”) differ on the j-LSB, but have equal j ′ -LSB for all j ′ > j.
Finally, two indices in [n] are said to be in the same j-bucket if their j ′ -LSB is equal for all j ′ ≥ j. This
definition is illustrated in Figure 6.
Observe that there are many j-buckets. In fact, the j-buckets partition the integers into sets of 2^j consecutive
integers each. For instance, the ranges of integers [0, 7], [8, 15], [16, 23], [24, 31] are all 3-buckets.
Note that each bucket is a contiguous subinterval of [n]. Moreover, in a 2413 copy which is type-j,
the “4” and the “3” are in the same (j + 1)-bucket and in different, but neighboring, j-buckets. This motivates
the following definition.
Definition 4.3 (4-heavy, 3-heavy). Consider a type-j 2413 copy (i1 , i2 , i3 , i4 ). We say that the copy is
4-heavy if i2 and i3 , i.e., the “4” and “1”, are in the same j-bucket. Otherwise, we say that the copy is
3-heavy.
Note that in a type-j copy (i1 , i2 , i3 , i4 ) that is 3-heavy, i3 (“the 1-entry”) is in the same j-bucket as i4
(“the 3-entry”). Similarly, in a type-j copy (i1 , i2 , i3 , i4 ) that is 4-heavy, i3 (“the 1-entry”) is in the same
j-bucket as i2 (“the 4-entry”). This yields the following observation.
Observation 4.4. Each type-j copy is either 3- or 4-heavy, but not both.
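As a quick illustration of Definitions 4.2 and 4.3, the following helper (our own hypothetical code) computes the type j of a 2413 copy and whether it is 3-heavy or 4-heavy; on the copy from Figure 6 below it reproduces the classification given there.

```python
def same_bucket(x, y, j):
    """True iff indices x and y agree on every j'-LSB with j' >= j,
    i.e., they lie in the same j-bucket."""
    return (x >> j) == (y >> j)

def classify_2413_copy(i1, i2, i3, i4):
    """Given a 2413 copy at indices i1 < i2 < i3 < i4 (the "2", "4", "1", "3"),
    return (j, label): j is the copy's type, the highest bit on which the indices
    of the "4" and the "3" differ, and label says which j-bucket holds the "1"."""
    j = (i2 ^ i4).bit_length() - 1       # most significant differing bit of i2 and i4
    label = '4-heavy' if same_bucket(i2, i3, j) else '3-heavy'
    return j, label

# The copy from Figure 6, at indices (2, 6, 7, 14):
print(classify_2413_copy(2, 6, 7, 14))  # -> (3, '4-heavy')
```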
Figure 6: This example depicts a copy of 2413 equal to (2, 6, 7, 14) with π(i1) = 8, π(i2) = 12, π(i3) = 6 and π(i4) = 9. Since i2 = (00110)_2 and i4 = (01110)_2, this copy is type-3. Moreover, all of i1, i2, i3 and i4 are in the same 4-bucket. The indices i1, i2 and i3 are in the same 3-bucket as well, while i2 and i3 are in addition in the same 2-bucket and 1-bucket; see Definition 4.2.
Algorithm 2 Approximate-2413-Copies
Input: A permutation π; an approximation parameter ε > 0
Output: A 1 + ε approximation of the number of 2413 copies in π
Each copy of (2413) in π is type-j for exactly one value of j, and moreover, each such copy is either
3-heavy or 4-heavy, but not both. Hence, the sum ∑_{i,j} (C_{j,3,i} + C_{j,4,i}) is a (1 + ε)-approximation of the
number of (2413)-copies in π.
Since S can be built in Õ(n) time, and each invocation to the algorithms from Lemmas 4.5 and 4.6 takes
poly(log n, 1/ε) time, Algorithm 2 runs in Õ (n · poly(1/ε)) time.
It remains to prove Lemma 4.5 and Lemma 4.6. We refer the reader to a (very schematic) illustration
in Figure 7 for the 4-heavy case (Lemma 4.5).
Figure 7: A helper illustration for the proof of Lemma 4.5. In this sketch, t is an integer, and the x-axis ticks mark t · 2^j, (t + 1) · 2^j − 1, and (t + 2) · 2^j. The shaded rectangle corresponds to the set X of “candidates for 3”, lying to the right of the fixed “4”.
“3” candidates. We approximate |Ai| by first fixing the “candidate for 3”; the “candidate for 4” is already
fixed by the definition of Ai. Moreover, we show that a particular function is monotone with respect to those
candidates, which will enable us to apply Lemma 2.4; we invoke Lemma 2.4 with parameter ε/3. Formally,
let X be the set of indices with the following two properties: (1) they lie in the j-bucket immediately
neighboring, to the right, the j-bucket that i belongs to, and (2) their value is smaller than π(i). In
Figure 7, X corresponds to the shaded area.
For each such candidate x ∈ X, define f(x) as the number of (2413)-copies in Ai in which x plays the role
of the “3”. Importantly, f(x) is “monotone by value” within the relevant bucket. Precisely, within Ai, f(x)
is non-decreasing as a function of π(x) over x ∈ X. This is easy to see: for x, x′ ∈ X such that π(x) > π(x′),
if (i1, i, i3, x′) is a (2413) copy, then (i1, i, i3, x) is a (2413) copy as well.
Approximating |Ai|. By definition, we have that |Ai| = ∑_{x∈X} f(x). Moreover, since f(x) is monotone
with respect to π(x) over x ∈ X, we approximate ∑_{x∈X} f(x) by applying Lemma 2.4. Let X′ ⊆ X be the
subset of O(log n/ε) indices, prescribed by Lemma 2.4, for which we need to (approximately) compute f(x).
Observe that all the elements in X belong to a well-defined rectangle. Hence, each point in X′ can be found
in polylog n time.
Approximating f(x). Let x ∈ X′. Similarly to before, we approximate f(x) in poly(log n, 1/ε) time using the segment
tree and another application of the Birgé technique.
Indeed, let Sx be the set of all 1-candidates, which are the elements located between i and the rightmost end of its
j-bucket whose values are less than π(x). For each y ∈ Sx, let g(y) denote the number of (2413)-copies
of the form (i1, i, y, x). Note that g(y) is monotone non-increasing in value. That is, when π(y) increases,
the number of (2413)-copies in Ai that y participates in as a “1” can only decrease. Moreover, it is easy to
6 Throughout our proofs, it is instructive to picture the input as a set of points with coordinates (i, πi ) for all i ∈ [n]. The
terminology such as “left”, “right”, “above”, and “below” is defined with respect to that depiction of the input.
compute g(y) exactly for a specific value of y by invoking a single operation specified by Lemma 2.1. That
operation would count all elements that are larger than π(y), smaller than π(x), and are located to the left
of i.
Now, because of the monotonicity of g, and because f(x) = ∑_{y∈Sx} g(y), we apply the Birgé technique
(Lemma 2.4) to approximate f(x) for any specific value of x ∈ X to within a (1 + ε/3)-factor using O(log n/ε)
computations of g(y).
Each of the applications of Birgé introduces a multiplicative error of 1 ± ε/3. Provided that ε ≤ 1/2, the
total multiplicative error is less than 1 ± ε.
Two-dimensional segment tree. As the starting point, we build a two-dimensional segment tree over
the points (i, πi ) over all i ∈ [n]. In Appendix A, we already recall the definition of a segment tree and
describe how we use it to count 4-patterns. For counting 5-patterns, we build a two-dimensional segment
tree as follows:
(1) A segment tree S is built over the points (i, πi) with respect to their x-coordinate. We also use the term outer
segment tree to refer to S.
(2) Consider a vertex v in S, and let [a, b] be the interval of the x-axis v corresponds to. Then, v stores all
the points (i, πi ) such that a ≤ i ≤ b.
(3) The points inside each vertex v of S are organized as a segment tree with respect to the y-coordinate of
the v’s points. We call these segment trees inner.
(4) Let v be a vertex in the outer segment tree and w a vertex in v's inner segment tree. Let v correspond to [a, b]
and w to [c, d]. Then, w stores all the points within rectangle [a, b] × [c, d], i.e., w stores all (i, πi ) such
that a ≤ i ≤ b and c ≤ πi ≤ d. Two copies of those points are kept. One copy is sorted with respect to
the x-coordinates and the other copy is sorted with respect to the y-coordinate.
Hence, S is a segment tree of segment trees. The outer segment tree partitions the plane with respect to the
x-coordinate into recursively nested strips. The inner segment trees partition each of the strips into another
family of recursively nested strips but with respect to the y-coordinate.
A point (i, πi) is replicated within O(log n) vertices in the outer segment tree. Each of those outer vertices
replicates (i, πi) O(log n) times within its inner segment tree. Hence, a point is replicated O(log^2 n) times
within S.
This conclusion has two implications. First, the points inside the vertices of the inner segment trees can
be sorted in O(n log^3 n) time. Second, the total number of non-empty vertices across all inner segment trees
is O(n log^2 n). This is essential as it enables us to build and maintain S in only Õ(n) time by not creating
the vertices that contain no point inside.
Pre-processing 12 copy counts. Once the two-dimensional segment tree S is built as described, we
process its vertices to pre-compute the number of 12 copies inside each vertex of the inner segment trees.
First, we show the following claim.
Lemma 5.2 (12 copies across disjoint useful rectangles). Given two distinct vertices w1 and w2 belonging
to inner segment trees of S, we can (1 + ε)-approximate the number of 12 copies i1, i2 with (i1, πi1) inside
w1 and (i2, πi2) inside w2 in O(ε^{-1} log^2 n) time.
Proof. Let R1 be the rectangle corresponding to w1, and R2 be the rectangle corresponding to w2. If R2 is
below or to the left of R1, we just return 0. If R2 is above and to the right of R1, we return |R1| · |R2|.
Without loss of generality, assume that R2 is to the right of R1. We now use the Birgé theorem to approximate
the number of relevant 12 pairs as follows. Note that we simply need to count the pairs of points (u, v) from
R1 × R2 such that u is below v. The higher up u is, the fewer such (u, v) pairs there are. Thus, we only need
to compute the number of possible v for O(ε^{-1} log n) different possibilities of u, by Lemma 2.4, each of which
can be computed in O(log n) time.
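A small, self-contained sketch of the cross-rectangle count from this proof (exact rather than approximate, with hypothetical names): when R2 lies to the right of R1, sorting R2's points by height lets one count, for each u in R1, the points of R2 above it. This per-u count is non-increasing in the height of u, so replacing the exact loop over u by the O(ε^{-1} log n) representative heights of Lemma 2.4 gives the approximate version.

```python
from bisect import bisect_right

def count_12_across(points_r1, points_r2):
    """Exactly count pairs (u, v) with u in R1, v in R2 and u below v, assuming
    every point of R2 lies strictly to the right of every point of R1.
    The count for a fixed u depends only on u's height and is non-increasing in it,
    which is the monotone structure the Birgé technique exploits."""
    heights_r2 = sorted(y for _, y in points_r2)
    total = 0
    for _, y_u in points_r1:
        total += len(heights_r2) - bisect_right(heights_r2, y_u)  # points of R2 above u
    return total

# Example: R1 holds points with x in {1, 2}; R2 holds points with x in {5, 6}.
print(count_12_across([(1, 3), (2, 7)], [(5, 5), (6, 9)]))  # -> 3
```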
Lemma 5.3. There is an algorithm that, in O(ε^{-1} · n log^4 n) time, (1 + ε)-approximates the number of 12
copies within each vertex of the segment tree.
Proof. We calculate this for all non-empty vertices by building upwards, starting from the leaf vertices and
using these results to compute the approximate counts for their parents.
Consider a vertex of the segment trees, and let R be the rectangle it corresponds to. If R contains only
a single point, then the number of 12 copies is 0. Otherwise, R splits into two rectangles, R1 and R2 ; we
discuss this split below. The number of 12 copies in R is approximated by approximating the number of
copies within R1 and R2 and approximating the number of 12 copies across R1 and R2 . The former counts
are already precomputed, while the latter counts are approximated by Lemma 5.2.
On the split of R into R1 and R2 . Let [a, b] × [c, d] be the rectangle R, v be the vertex in the outer
segment tree of S corresponding to [a, b], and w be the vertex in the v’s segment tree corresponding to [c, d].
If d − c > 1, then R1 and R2 are w’s children.
However, if d − c = 1, then R1 and R2 are the rectangles corresponding to [a, (a + b)/2] × [c, d] and
[(a + b)/2, b] × [c, d]. That is, R1 and R2 are inside the outer children of v.
Counting each copy exactly once. Consider a vertex v in the outer segment tree of S. Now, let w be
a vertex in the inner segment tree of v. As a running example, consider the 5-pattern 24135, and let C
represent a copy of this pattern. To ensure that C is contained in v but not in its children, we require that
the leftmost position, i.e., the position of “2”, is to the left of the vertical separator, and that the position
of “5” is to the right.
Similarly, to ensure that w is the smallest vertex in v’s inner segment tree that contains C, we require
that the topmost value, i.e., “5”, is above, and the value of “1” is below the horizontal separator. This setup
is sketched in Figure 4.
To conclude, observe first that C is contained in the root of S, ensuring that C is counted by some vertex w. Second,
for any copy C, there exists a unique vertex w that counts C: only w and its parent contain C, while none of
w’s siblings do. This is because w’s siblings correspond to disjoint rectangles by the construction of segment
trees.
Configurations. Our main algorithm counts 5-patterns by distributing these counts across the inner
vertices of S, as discussed in Section 5.2. Once a vertex w is fixed, we ensure that only copies not belonging
to any of w’s children are counted.
Fixing w induces a horizontal and vertical separator. First, our algorithm considers all valid configura-
tions, such as: “1” and “2” are below the horizontal separator while “3”, “4”, and “5” are above; and “2”
and “4” are to the left of the vertical separator, while “1”, “3”, and “5” are to the right.
Second, for each configuration, the algorithm fixes one value (e.g., “4”). The choice of which value to
fix is guided by our algorithm, which provides a “recipe” on leveraging the Birgé technique (Lemmas 2.3
and 2.4) and the 12-copy primitive (Section 5.1).
Figure 8: This is a more detailed example compared to the one provided in Figure 4, to aid the discussion in Section 5.3. Here, v corresponds to the rectangle [a, b] × [0, ∞). Its two children, vL and vR, correspond to the rectangles [a, (a + b)/2] × [0, ∞) and [(a + b)/2, b] × [0, ∞). The vertical separator at (a + b)/2 and the horizontal separator at (c + d)/2 split the range of v into the four quadrants vLB, vLA, vRB, and vRA.
Location of the candidates. We now discuss the locations of candidates for “1”, “2”, “3”, “4”, and “5”.
Let v be the vertex in the outer segment tree which contains w. Let vL and vR be v's two children
in the outer segment tree. Let [a, b] × [c, d] correspond to w. Finally, define vLB to be the vertex in vL's
inner segment tree corresponding to [a, (a + b)/2] × [c, (c + d)/2], and vLA the one corresponding to
[a, (a + b)/2] × [(c + d)/2, d]. Similarly, define vRB and vRA to correspond to [(a + b)/2, b] × [c, (c + d)/2]
and [(a + b)/2, b] × [(c + d)/2, d], respectively. It may be helpful to interpret 'L' as left, 'R' as right, 'B' as
below, and 'A' as above the corresponding separators. One such example is illustrated in Figure 8.
The algorithm iterates over points in vLA to select a candidate for “4”. Similarly, when the Birgé
technique is applied to consider the candidates for “2”, it is applied within the points of vLB , and so on.
Again, the choice of which value to fix in this configuration is made by the algorithm to enable the use of
the Birgé technique (Lemmas 2.3 and 2.4) and the 12-copy primitive (Section 5.1).
Time complexity. Recall that our algorithm first fixes global separators. Then, for a given 5-pattern, it
considers all possible configurations. For each configuration, the algorithm also generates a recipe on which
value to fix and how to utilize the Birgé technique and the 12-copy primitive. Importantly, it suffices to
fix only one element for a given configuration. The rest of the counting is carried by applying the Birgé
technique and using the 12-copy primitive.
This leads to a total time complexity of Õ(n · poly(ε−1 )).
When the recipe “does not” work: 13524 and 14253. It can be shown from the output of our code
(see here: https://github.com/omribene/approx-counting/blob/main/5-patterns.txt) that there are precisely
two equivalence classes for which the above recipe does not work. These are the classes corresponding to the
patterns 13524 and 14253 where, additionally, the vertical and horizontal separator appear right next to the
“1” element. In other words, the “1” appears in the bottom-left area, and the rest of the pattern appears in
the top-right area.
We note that in these two cases, the top-right part of the pattern is order-equivalent to the pattern 2413.
Thus, it is possible to approximately count the number of 2413-copies (or 3142-copies, in the second case) in
this top-right block in near-linear time, using our mechanism for approximate counting 4-patterns. Counting
the number of values in the bottom left in linear time is trivial. The total count of 13524 (or 14253) copies
in the full block is the product of these two quantities, and the proof follows.
On the proof of Theorem 1.2. To prove Theorem 1.2, we conduct the following modification to our
approximate counting algorithm. Whenever the latter algorithm accesses (and/or aims to evaluate) elements
from a monotone sequence x1 ≥ x2 ≥ · · · ≥ xr using the Birgé technique, the enumeration algorithm will
enumerate over all elements in the sequence one by one, starting from the largest value x1 and the location
corresponding to it in the input function, and descending in value through the sequence. It is straightforward
to verify that, due to the monotonicity of all sequences of quantities considered, the algorithm will list all
copies of the pattern throughout its run.
We note that Albert, Aldred, Atkinson, and Holton [AAAH01] employed a somewhat similar technique
to construct a near-linear algorithm for the detection variant, specifically for the case k = 4; see Section 4
in their paper.
References
[AAAH01] Michael H. Albert, Robert E. L. Aldred, Mike D. Atkinson, and Derek A. Holton. Algorithms
for pattern involvement in permutations. In 12th International Symposium on Algorithms and
Computation (ISAAC), pages 355–367, 2001. ↑2, ↑20
[AB16] Hiraku Abe and Sara Billey. Consequences of the Lakshmibai-Sandhya theorem: the ubiquity
of permutation patterns in Schubert calculus and related geometry. Advanced Studies in Pure
Mathematics, 71:1–52, 2016. ↑1
[ABF23] Amir Abboud, Karl Bringmann, and Nick Fischer. Stronger 3-SUM lower bounds for approx-
imate distance oracles via additive combinatorics. In Proceedings of the 55th Annual ACM
Symposium on Theory of Computing (STOC), pages 391–404, 2023. ↑1, ↑3
[ABKZ22] Amir Abboud, Karl Bringmann, Seri Khoury, and Or Zamir. Hardness of approximation in P
via short cycle removal: Cycle detection, distance oracles, and beyond. In Proceedings of the
54th Annual ACM Symposium on Theory of Computing (STOC), pages 1487–1500, 2022. ↑1,
↑3
[AP98] Arne Andersson and Ola Petersson. Approximate indexed lists. Journal of Algorithms,
29(2):256–276, 1998. ↑1, ↑2, ↑16
[AR08] Shlomo Ahal and Yuri Rabinovich. On complexity of the subpattern problem. SIAM Journal
on Discrete Mathematics, 22(2):629–649, 2008. ↑2, ↑7
[AYZ97] N. Alon, R. Yuster, and U. Zwick. Finding and counting given length cycles. Algorithmica,
17(3):209–223, 1997. ↑1
[BBL98] Prosenjit Bose, Jonathan F. Buss, and Anna Lubiw. Pattern matching for permutations.
Information Processing Letters, 65(5):277–283, 1998. ↑2
[BD14] Wicher Bergsma and Angelos Dassios. A consistent test of independence based on a sign
covariance related to Kendall’s tau. Bernoulli, 20(2):1006–1028, 2014. ↑1, ↑2
[Bir87] Lucien Birgé. On the risk of histograms for estimating decreasing densities. The Annals of
Statistics, 15(3):1013 – 1022, 1987. ↑4, ↑8
[BKM21] Benjamin Aram Berendsohn, László Kozma, and Dániel Marx. Finding and counting permu-
tations via CSPs. Algorithmica, 83(8):2552–2577, 2021. ↑2, ↑7
[BKO24] Benjamin Aram Berendsohn, László Kozma, and Michal Opler. Optimization with pattern-
avoiding input. In Proceedings of the 56th Annual ACM Symposium on Theory of Computing,
STOC 2024, pages 671–682, 2024. ↑2
[BKR61] J. R. Blum, J. Kiefer, and M. Rosenblatt. Distribution free tests of independence based on the
sample distribution function. The Annals of Mathematical Statistics, 32(2):485–498, 1961. ↑1
[BKTW21] Édouard Bonnet, Eun Jung Kim, Stéphan Thomassé, and Rémi Watrigant. Twin-width I:
Tractable FO model checking. Journal of the ACM, 69(1), 2021. ↑2, ↑7
[BL12] Marie-Louise Bruner and Martin Lackner. A fast algorithm for permutation pattern matching
based on alternating runs. In Scandinavian Workshop on Algorithm Theory (SWAT), pages
261–270, 2012. ↑2
[BP02] Christoph Bandt and Bernd Pompe. Permutation entropy: A natural complexity measure for
time series. Phys. Rev. Lett., 88:174102, 2002. ↑1
[Can20] Clément L. Canonne. A Survey on Distribution Testing: Your Data is Big. But is it Blue?
Number 9 in Graduate Surveys. Theory of Computing Library, 2020. ↑4, ↑8, ↑9
[CDN23] Gabriel Crudele, Peter Dukes, and Jonathan A. Noel. Six permutation patterns force quasir-
andomness. arXiv:2303.04776, 2023. ↑1, ↑2
[Cha21] Sourav Chatterjee. A new coefficient of correlation. Journal of the American Statistical Asso-
ciation, 116(536):2009–2022, 2021. ↑1, ↑2
[CKS19] Marek Cygan, Lukasz Kowalik, and Arkadiusz Socala. Improving TSP tours using dynamic
programming over tree decompositions. ACM Trans. Algorithms, 15(4), 2019. ↑7
[CP10] Timothy M. Chan and Mihai Pătraşcu. Counting inversions, offline orthogonal range counting,
and related problems. In Proceedings of the Twenty-First Annual ACM-SIAM Symposium on
Discrete Algorithms (SODA), pages 161–173, 2010. ↑1, ↑2, ↑3, ↑7, ↑16
[DDS+ 13] Constantinos Daskalakis, Ilias Diakonikolas, Rocco A. Servedio, Gregory Valiant, and Paul
Valiant. Testing k-modal distributions: Optimal algorithms via reductions. In Proceedings of
the 2013 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1833–1852,
2013. ↑4, ↑9
[DDS14] Constantinos Daskalakis, Ilias Diakonikolas, and Rocco A. Servedio. Learning k-modal distri-
butions via testing. Theory of Computing, 10(20):535–570, 2014. ↑4, ↑9
[DG20] Bartlomiej Dudek and Pawel Gawrychowski. Counting 4-patterns in permutations is equivalent
to counting 4-cycles in graphs. In 31st International Symposium on Algorithms and Computa-
tion (ISAAC), pages 23:1–23:18, 2020. ↑1, ↑2, ↑3
[Die89] Paul F. Dietz. Optimal algorithms for list indexing and subset rank. In Workshop on Algorithms
and Data Structures (WADS), pages 39–46, 1989. ↑1, ↑2
[DKNS01] Cynthia Dwork, Ravi Kumar, Moni Naor, and D. Sivakumar. Rank aggregation methods for
the web. In Proceedings of the 10th International Conference on World Wide Web (WWW),
pages 613–622, 2001. ↑1
[DKS17] Søren Dahlgaard, Mathias Bæk Tejs Knudsen, and Morten Stöckel. Finding even cycles faster
via capped k-walks. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory
of Computing, STOC 2017, pages 112–120, 2017. ↑1
[EZ20] Chaim Even-Zohar. independence: Fast rank tests. arXiv:2010.09712, 2020. ↑1, ↑2
[EZL21] Chaim Even-Zohar and Calvin Leng. Counting small permutation patterns. In Proceedings of
the 2021 ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 2288–2302, 2021. ↑1,
↑2, ↑3, ↑10
[FDRM09] Guillaume Fertin, Anne Denise, Isabelle Raffinot, and André Jean Pierre Mary. Combinatorics
of Genome Rearrangements. The MIT Press, 2009. ↑1
[FH92] Zoltán Füredi and Péter Hajnal. Davenport-Schinzel theory of matrices. Discrete Mathematics,
103(3):233–251, 1992. ↑2
[Fox13] Jacob Fox. Stanley-Wilf limits are typically exponential. arXiv:1310.8378, 2013. ↑2
[FS89] Michael L. Fredman and Michael E. Saks. The cell probe complexity of dynamic data structures.
In Proceedings of the Twenty-First Annual ACM Symposium on Theory of Computing (STOC),
pages 345–354, 1989. ↑1, ↑2
[GM14] Sylvain Guillemot and Dániel Marx. Finding small patterns in permutations in linear time.
In Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms,
SODA ’14, page 82–101, 2014. ↑2, ↑3, ↑7
[GR22] Pawel Gawrychowski and Mateusz Rzepecki. Faster exponential algorithm for permutation
pattern matching. In Symposium on Simplicity in Algorithms (SOSA), pages 279–284, 2022.
↑2
[Grü23] R. Grübel. Ranks, copulas, and permutons. Metrika, 2023. ↑1, ↑2
[Hoe48] Wassily Hoeffding. A non-parametric test of independence. The Annals of Mathematical Statis-
tics, 19(4):546 – 557, 1948. ↑1
[JK17] Vı́t Jelı́nek and Jan Kynčl. Hardness of permutation pattern matching. In Proceedings of the
2017 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 378–396, 2017.
↑2
[JOP21] Vı́t Jelı́nek, Michal Opler, and Jakub Pekárek. Griddings of Permutations and Hardness of
Pattern Matching. In 46th International Symposium on Mathematical Foundations of Computer
Science (MFCS), pages 65:1–65:22, 2021. ↑2
[JX23] Ce Jin and Yinzhan Xu. Removing additive structure in 3SUM-based reductions. In Proceed-
ings of the 55th Annual ACM Symposium on Theory of Computing (STOC), pages 405–418.
Association for Computing Machinery, 2023. ↑1, ↑3
[KB94] V. S. Koroljuk and Yu. V. Borovskich. Theory of U-Statistics, volume 273 of Mathematics and
Its Applications. Springer Netherlands, 1994. ↑1
[Kla00] Martin Klazar. The Füredi-Hajnal conjecture implies the Stanley-Wilf conjecture. In Formal
Power Series and Algebraic Combinatorics, pages 250–255, 2000. ↑2
[Lee90] A. J. Lee. U-Statistics: Theory and Practice. CRC Press, New York, 1990. ↑1
[MT04] Adam Marcus and Gábor Tardos. Excluded permutation matrices and the Stanley–Wilf con-
jecture. Journal of Combinatorial Theory, Series A, 107(1):153–160, 2004. ↑2
[Opl24] Michal Opler. An optimal algorithm for sorting pattern-avoiding sequences. arXiv:2409.07868,
2024. To appear in FOCS’24. ↑2
[WWWY15] Virginia Vassilevska Williams, Joshua R. Wang, Ryan Williams, and Huacheng Yu. Finding
four-node subgraphs in triangle time. In Proceedings of the Twenty-Sixth Annual ACM-SIAM
Symposium on Discrete Algorithms (SODA), pages 1671–1680, 2015. ↑1, ↑3
[Yan70] Takemi Yanagimoto. On measures of association and a related problem. Annals of the Institute
of Statistical Mathematics, 22(1):57–63, 1970. ↑1
A Segment trees
We represent a permutation π as a set of n points (i, πi ) for each i ∈ [n]. Building a sparse segment tree
over these points to allow for two-dimensional counting queries is a textbook problem. For completeness, we
outline the construction and query support of this data structure as per Lemma 2.1.
Building the segment tree. We aim to build a data structure to answer the two-dimensional queries
Lemma 2.1 requires. To achieve this, we construct two segment trees: one for the points (i, πi ) and one for
the points (n − πi , i), for all i ∈ [n]. We use S to refer to the first one, while we use S̃ to refer to the second
one. We now describe how to build S; the tree S̃ is built analogously.
Let ñ = 2^{⌈log n⌉}. We first build a segment tree S on the x-coordinates, covering the range [1, ñ]. That tree
can be visualized as a complete binary tree on ñ leaves. The root vertex corresponds to the entire interval,
its left child to the interval [1, ñ/2], and its right child to the interval [ñ/2 + 1, ñ]. In general, if a vertex
corresponds to the interval [t, t + 2^j − 1], its left and right children correspond to the intervals [t, t + 2^{j−1} − 1]
and [t + 2^{j−1}, t + 2^j − 1], respectively.
Second, consider a vertex v in S and let [a, a + 2^j − 1] be the range v corresponds to. The vertex v
stores in an array Av all the points (i, πi) such that a ≤ i ≤ a + 2^j − 1. Av is sorted with respect to the
y-coordinates, i.e., with respect to πi .
It is folklore, and also easy to prove, that a point (i, πi ) is stored in log ñ vertices v of S. Therefore, a
point (i, πi ) is replicated log ñ times inside S.
To populate S from π, we insert the points (i, πi ) one by one, adding them to a list Lv for each vertex v
covering the corresponding range. After the insertions, each Lv is then sorted to form the array Av . There
are O(n log n) points in S, partitioned across different Av. Hence, sorting all of them takes O(n log^2 n) time.
Implementing desired operations. Lemma 2.1 specifies three operations that need to be supported on
S and S̃.
The first operation counts the points within the rectangle [i, j] × [a, b]. The range [i, j] can be partitioned
into O(log n) disjoint ranges, each associated with a vertex in S. For each vertex v, we count points in Av
with y-coordinates in [a, b] using two binary searches, each in O(log n) time. Hence, this operation can be
implemented in O(log^2 n) time.
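For concreteness, a simplified, hypothetical Python rendering of this first operation (a merge-sort tree: each node stores the sorted y-values of its x-range, and a rectangle count visits O(log n) canonical nodes with two binary searches each) might look as follows.

```python
from bisect import bisect_left, bisect_right

class MergeSortTree:
    """Segment tree over x-coordinates; node v stores the sorted y-values A_v of its range."""
    def __init__(self, values):                   # values[i] = pi(i), 0-indexed positions
        self.n = len(values)
        self.size = 1
        while self.size < self.n:
            self.size *= 2
        self.tree = [[] for _ in range(2 * self.size)]
        for i, v in enumerate(values):
            self.tree[self.size + i] = [v]
        for node in range(self.size - 1, 0, -1):  # merge children bottom-up
            self.tree[node] = sorted(self.tree[2 * node] + self.tree[2 * node + 1])

    def count(self, i, j, a, b):
        """Number of positions x in [i, j] (0-indexed, inclusive) with a <= pi(x) <= b."""
        res, lo, hi = 0, i + self.size, j + self.size + 1
        while lo < hi:                            # walk the O(log n) canonical nodes
            if lo & 1:
                arr = self.tree[lo]
                res += bisect_right(arr, b) - bisect_left(arr, a)
                lo += 1
            if hi & 1:
                hi -= 1
                arr = self.tree[hi]
                res += bisect_right(arr, b) - bisect_left(arr, a)
            lo //= 2
            hi //= 2
        return res

# Example on the permutation 136548279 (values at positions 0..8):
T = MergeSortTree([1, 3, 6, 5, 4, 8, 2, 7, 9])
print(T.count(2, 6, 2, 6))  # positions 2..6 holding values in [2, 6] -> 4
```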
For the second operation, let v1, . . . , vk denote the O(log n) vertices in S covering disjoint subranges that
collectively form [i, j]. Let ℓ be the index as described in the second operation, i.e., we are looking for the
ℓ-th leftmost element in S_{i,j}^{a,b}. To implement this, the algorithm finds the largest k′ such that k′ ≤ k and the
cumulative number of points within [i, j] × [a, b] across v1, v2, . . . , vk′ is less than ℓ, denoted by ℓ′. Next, we
search for the (ℓ − ℓ′)-th leftmost point within the left and right children of vk′+1. This approach processes
O(log n) vertices in S, each performing two binary searches, for a total time complexity of O(log^2 n).
The third operation on S is equivalent to querying S̃ as in the second operation.