Worst-case number of queries needed to learn a monotone predicate over a poset

15

Consider a finite poset $(X, \leq)$ over $n$ items, and an unknown monotone predicate $P$ over $X$ (i.e., for any $x$, $y \in X$, if $P(x)$ and $x \leq y$ then $P(y)$). I can evaluate $P$ by providing one node $x \in X$ and finding out whether $P(x)$ holds or not. My goal is to determine exactly the set of nodes $x \in X$ such that $P(x)$ holds, using as few evaluations of $P$ as possible. (I can choose my queries depending on the answers to all previous queries; I am not required to plan all queries in advance.)

A strategy $S$ over $(X, \leq)$ is a function that tells me, as a function of the queries I have run so far and of their answers, which node to query next, and which ensures that for any predicate $P$, by following the strategy, I will reach a state where I know the value of $P$ on all nodes. The running time $r(S, P)$ of $S$ on a predicate $P$ is the number of queries required to know the value of $P$ on all nodes. The worst running time of $S$ is $wr(S) = \max_P r(S, P)$. An optimal strategy $S'$ is one such that $wr(S') = \min_S wr(S)$.

My question is the following: given as input a poset $(X, \leq)$, how can I determine the worst running time of optimal strategies?
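For very small posets, the quantity asked about can be computed by exhaustive minimax over the set of predicates still consistent with the answers so far. The following is only a brute-force sketch under stated assumptions: nodes are numbered `0..n-1`, the order is passed as a hypothetical callback `leq(x, y)`, and the names `upsets` and `optimal_worst_case` are made up for this illustration. It is exponential in $n$ and does not answer the question of an efficient method.

```python
from functools import lru_cache
from itertools import product

def upsets(n, leq):
    """All upward-closed subsets of {0..n-1}; leq(x, y) means x <= y."""
    result = []
    for bits in product([False, True], repeat=n):
        # Monotonicity: if x is in the set and x <= y, then y is in the set.
        if all(bits[y] for x in range(n) if bits[x]
               for y in range(n) if leq(x, y)):
            result.append(frozenset(i for i in range(n) if bits[i]))
    return result

def optimal_worst_case(n, leq):
    """Minimax number of queries needed to identify a monotone predicate."""
    @lru_cache(maxsize=None)
    def solve(candidates):
        if len(candidates) <= 1:
            return 0  # the predicate is fully determined
        best = n  # querying every node always suffices
        for x in range(n):
            yes = frozenset(u for u in candidates if x in u)
            no = candidates - yes
            if not yes or not no:
                continue  # the answer is already implied, skip this query
            # The adversary picks the answer that leaves the harder subproblem.
            best = min(best, 1 + max(solve(yes), solve(no)))
        return best
    return solve(frozenset(upsets(n, leq)))
```

For instance, `optimal_worst_case(7, lambda x, y: x <= y)` handles a chain of seven nodes, and `optimal_worst_case(3, lambda x, y: x == y)` handles three pairwise incomparable nodes.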

[It is clear that $n$ queries are needed for an empty poset (we need to ask about every single node), and that about $\lceil \log_2 n \rceil$ queries are needed for a total order (do a binary search to find the frontier). A more general result is the following information-theoretic lower bound: the number of possible choices for the predicate $P$ is the number $N_X$ of antichains of $(X, \leq)$ (because there is a one-to-one mapping between monotone predicates and antichains, each antichain being read as the set of minimal elements at which $P$ holds), so, since each query gives us one bit of information, we need at least $\lceil \log_2 N_X \rceil$ queries, which subsumes the two previous cases. Is this bound tight, or are there posets whose structure is such that learning can require asymptotically more queries than this antichain-counting bound?]
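To make the lower bound concrete, here is a small sketch that counts antichains by brute force and evaluates $\lceil \log_2 N_X \rceil$. The node numbering `0..n-1`, the `leq(x, y)` callback and the function names are assumptions of this illustration, not part of the question.

```python
from itertools import combinations
from math import ceil, log2

def count_antichains(n, leq):
    """Count subsets of {0..n-1} whose elements are pairwise incomparable."""
    count = 0
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            if all(not leq(x, y) and not leq(y, x)
                   for x, y in combinations(subset, 2)):
                count += 1  # the empty set and singletons always qualify
    return count

def info_bound(n, leq):
    """Information-theoretic lower bound ceil(log2 N_X) on the query count."""
    return ceil(log2(count_antichains(n, leq)))
```

With four pairwise incomparable nodes (`leq = lambda x, y: x == y`) there are 16 antichains and the bound is 4. A chain of seven nodes (`leq = lambda x, y: x <= y`) has 8 antichains (the empty set plus seven singletons), so the bound is 3; in general a chain of $n$ nodes has $n + 1$ antichains.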

— a3nm

2

How is this different from your earlier question on this topic? cstheory.stackexchange.com/questions/14772/ …

— Suresh Venkat

1

Agreed, it is similar, but here I am interested in general posets, including posets of small width that do not look like the complete lattice. Also, I no longer care about incremental complexity or anything of the sort, just the number of queries required as a function of the choice of poset. In this setting, the interpretation as Boolean functions does not apply, and it seems that the answer depends on the "structure" of the poset (maybe the number of antichains, as I suggest). I hope this warrants a separate question; please close if I'm wrong.

— a3nm

1

FYI, in the complexity literature, strategies as you have defined them are generally called "decision trees", and they have standard notions of height (the measure you are interested in) and size.

— Joshua Grochow

Thanks, Joshua! I more or less knew this; I thought it was simpler to use vocabulary from game theory, but yes, I am aware that strategies can be seen as trees.

— a3nm

1

(No problem. By the way, I wasn't just pointing out that it can be seen as a tree; the way you described it was straightforward and very clear. I was just giving you the keyword / term of art that you could search for, besides it being a term that is probably immediately familiar to many people who frequent this site. Cheers!)

— Joshua Grochow

7

This is not a complete answer, but it is too long to be a comment.

I think I have found an example for which the bound $\lceil \log_2 N_X \rceil$ is not tight.

Consider the following poset. The ground set is $X=\{a_1, a_2, b_1, b_2\}$, and $a_i$ is smaller than $b_j$ for all $i,j\in\{1,2\}$. The other pairs are incomparable. (The Hasse diagram is a $4$-cycle.)

Let me identify monotone properties with upsets of the poset. This poset has seven upsets: $\emptyset$, $\{b_1\}$, $\{b_2\}$, $\{b_1,b_2\}$, $\{a_1,b_1,b_2\}$, $\{a_2,b_1,b_2\}$, $\{a_1,a_2,b_1,b_2\}$, and hence seven antichains, since antichains are in one-to-one correspondence with upsets. So, for this poset, $\lceil \log_2 N_X \rceil=\lceil \log_2 7 \rceil = 3$.

Now, by an adversary argument, I will show that any strategy needs at least four queries (and so must query all the elements). Let us fix an arbitrary strategy.

If the strategy first queries $a_1$, then the adversary answers "$P(a_1)$ doesn't hold." We are then left with five possibilities: $\emptyset$, $\{b_1\}$, $\{b_2\}$, $\{b_1,b_2\}$, $\{a_2,b_1,b_2\}$. Therefore, to determine which one is the case, we need at least $\lceil \log_2 5\rceil = 3$ more queries. In total, we need four queries. The same argument applies if the first query is $a_2$.

If the strategy first queries $b_1$, then the adversary answers "$P(b_1)$ holds." We are then left with five possibilities: $\{b_1\}$, $\{b_1,b_2\}$, $\{a_1,b_1,b_2\}$, $\{a_2,b_1,b_2\}$, $\{a_1,a_2,b_1,b_2\}$. Therefore, we need at least three more queries, as before. In total, we need four queries. The same argument applies if the first query is $b_2$.
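The case analysis above can be checked mechanically. The sketch below enumerates the upsets of this four-element poset, records the larger branch the adversary can force after each possible first query, and computes the exact minimax query count by brute force. All identifiers (`ELEMENTS`, `BELOW`, `queries_needed`, and so on) are made up for this illustration.

```python
from functools import lru_cache
from itertools import combinations
from math import ceil, log2

ELEMENTS = ("a1", "a2", "b1", "b2")
# Strict order relation of the example: every a_i lies below every b_j.
BELOW = {(a, b) for a in ("a1", "a2") for b in ("b1", "b2")}

def is_upset(s):
    """True if s is upward closed with respect to BELOW."""
    return all(y in s for (x, y) in BELOW if x in s)

UPSETS = frozenset(
    frozenset(c)
    for k in range(len(ELEMENTS) + 1)
    for c in combinations(ELEMENTS, k)
    if is_upset(frozenset(c))
)

def split(candidates, x):
    """Partition the candidate upsets by the answer to the query P(x)."""
    yes = frozenset(u for u in candidates if x in u)
    return yes, candidates - yes

@lru_cache(maxsize=None)
def queries_needed(candidates):
    """Exact worst-case number of queries for the remaining candidates."""
    if len(candidates) <= 1:
        return 0
    return min(
        1 + max(queries_needed(yes), queries_needed(no))
        for yes, no in (split(candidates, x) for x in ELEMENTS)
        if yes and no  # only queries whose answer is not already implied
    )

# Largest branch the adversary can force after each possible first query.
worst_branch = {x: max(map(len, split(UPSETS, x))) for x in ELEMENTS}
```

Here `len(UPSETS)` is 7, every entry of `worst_branch` is 5, and `queries_needed(UPSETS)` is 4, against `ceil(log2(7))`, which is 3.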

If we take $k$ parallel copies of this poset, then it has $7^k$ antichains, so the proposed bound is $\lceil \log_2 7^k \rceil \le 3k$. But since each copy requires four queries, we need at least $4k$ queries.

There may be a larger poset with a larger gap, but this argument can only improve the coefficient.

Here, the problem seems to be that no query partitions the search space evenly. In such a case, the adversary can force the larger part to remain.

— Yoshio Okamoto

1

Ah, interesting. Generalizing your example to $X = \{a_1, ..., a_n, b_1, ..., b_n\}$, it's clear that if the answer is $\forall i, \neg P(a_i)$ and $\forall i, P(b_i)$ then we won't know it for sure until all $2n$ nodes are queried. However, there are $2^{n+1} - 1$ antichains ($2^n-1$ non-empty subsets of $a_i$'s, idem for $b_i$'s, and the empty set), so the bound is not tight by a factor of 2. Thanks for this example. However, I don't really see how/if the gap could be more than a multiplicative factor, or if a non-trivial upper bound can be found, let alone an algorithm for an exact answer.
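As a sanity check on the count $2^{n+1} - 1$, here is a small brute-force sketch for the generalized example (every $a_i$ below every $b_j$). It counts upsets, which are in bijection with antichains. The function names are hypothetical, and the $2n$ figure in `bound_versus_need` is simply the lower bound argued above, not something the code derives.

```python
from itertools import product

def count_upsets_bipartite(n):
    """Count upsets of the poset where each of n a's lies below each of n b's."""
    count = 0
    for a_bits in product([False, True], repeat=n):
        for b_bits in product([False, True], repeat=n):
            # Upward closure: if any a_i is in the set, every b_j must be too.
            if not any(a_bits) or all(b_bits):
                count += 1
    return count

def bound_versus_need(n):
    """Return (ceil(log2 N_X), 2n) for the generalized example."""
    n_x = count_upsets_bipartite(n)
    # For an integer m >= 1, ceil(log2(m)) equals (m - 1).bit_length().
    return (n_x - 1).bit_length(), 2 * n
```

For `n = 3` this gives 15 upsets and the pair `(4, 6)`: the information-theoretic bound is $n + 1$, while $2n$ queries are needed.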

— a3nm

7

In their paper Every Poset Has a Central Element, Linial and Saks show (Theorem 1) that the number of queries required to solve the ideal identification problem in a poset $X$ is at most $K_0 \log_2 i(X)$, where $K_0 = 1/(2 - \log_2(1 + \log_2 5))$ and $i(X)$ is the number of ideals of $X$. What they call an "ideal" is actually a lower set, and there is an obvious one-to-one correspondence between monotone predicates and the lower sets of points at which they don't hold. Moreover, their "identification problem" is to identify the ideal by querying nodes, just as in my setting, so I think they are dealing with the problem I'm interested in and that $i(X) = N_X$.

So, according to their result, the information-theoretic lower bound is tight up to a relatively small multiplicative constant. So this basically settles the question of the number of questions required, as a function of $N_X$ and up to a multiplicative constant: it is between $\log_2 N_X$ and $K_0 \log_2 N_X$ .
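To get a feel for the size of the constant, the expression for $K_0$ quoted above can simply be evaluated numerically. This is a minimal sketch; `query_window` is a made-up helper that returns the two ends of the stated range for a given value of $N_X$.

```python
from math import log2

# Constant from the quoted theorem: K_0 = 1 / (2 - log2(1 + log2 5)).
K0 = 1 / (2 - log2(1 + log2(5)))

def query_window(num_ideals):
    """Return (log2 N_X, K_0 * log2 N_X) for a poset with num_ideals ideals."""
    lower = log2(num_ideals)
    return lower, K0 * lower
```

Evaluating `K0` gives roughly 3.73, so the two ends of the range differ by a factor of a little under four.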

Linial and Saks quote a personal communication by Shearer to say that there are known orders for which we can prove a lower bound of $K_1 \log_2 N_X$ for some $K_1$ which is just slightly less than $K_0$ (this is in the spirit of Yoshio Okamoto's answer who tried this approach for a smaller value of $K_1$ ).


— a3nm

5

For the Boolean cube $(\{0, 1\}^n, \leq)$ (equivalently, the poset $(2^S, \subseteq)$ of all subsets of an $n$-element set), the answer is given by Korobkov and Hansel's theorems (from 1963 and 1966, respectively). Hansel's theorem [1] states that an unknown monotone Boolean function (i.e., an unknown monotone predicate on this poset) can be learned by a deterministic algorithm making at most $\phi(n) = \binom{n}{\lfloor n/2 \rfloor} + \binom{n}{\lfloor n/2 \rfloor + 1}$ queries (that is, asking $\phi(n)$ questions in the worst case). This algorithm matches the lower bound of Korobkov's theorem [2], which says that $\phi(n) - 1$ queries do not suffice. (So Hansel's algorithm is optimal in the worst-case setting.) An algorithm in both statements is understood as a deterministic decision tree.

The logarithm of the number of antichains in $(\{0, 1\}^n, \le)$ is asymptotically equal to $\binom{n}{\lfloor n/2 \rfloor} \sim 2^n / \sqrt{\pi n / 2}$ , so there is a constant-factor gap between $\log N_X$ and the optimal algorithm performance $\phi(n) \sim 2 \binom{n}{\lfloor n/2 \rfloor}$ for this poset.
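For small $n$ the two quantities being compared can be computed directly. The sketch below evaluates $\phi(n)$ from the formula above and counts monotone Boolean functions (equivalently, antichains of the cube) by brute force. The function names are made up for this illustration, and the enumeration is only practical for $n \le 4$.

```python
from itertools import product
from math import comb, log2

def hansel_phi(n):
    """phi(n) = C(n, floor(n/2)) + C(n, floor(n/2) + 1)."""
    return comb(n, n // 2) + comb(n, n // 2 + 1)

def count_monotone_functions(n):
    """Count monotone functions {0,1}^n -> {0,1} by exhaustive enumeration."""
    points = list(product((0, 1), repeat=n))
    index = {p: i for i, p in enumerate(points)}
    # Covering pairs (lo, hi): hi is lo with a single 0 flipped to 1.
    covers = [
        (index[p], index[p[:i] + (1,) + p[i + 1:]])
        for p in points for i in range(n) if p[i] == 0
    ]
    count = 0
    for values in product((0, 1), repeat=len(points)):
        # Checking covering pairs is enough to guarantee monotonicity.
        if all(values[lo] <= values[hi] for lo, hi in covers):
            count += 1
    return count

def compare(n):
    """Return (log2 N_X, phi(n)) for the Boolean n-cube."""
    return log2(count_monotone_functions(n)), hansel_phi(n)
```

For `n = 4` there are 168 monotone functions, so $\log_2 N_X \approx 7.4$, while `hansel_phi(4)` is 10.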

Unfortunately, I have not been able to find a good treatment of Hansel's algorithm in English available on the web. It is based on a lemma that partitions the n-cube into $\phi(n)$ chains with special properties. Some description can be found in [3]. For the lower bound, I don't know any reference to a description in English.

Since I am familiar with these results, I can post a description on arXiv, if the treatment in Kovalerchuk's paper does not suffice.

If I am not much mistaken, there have been attempts to generalize Hansel's approach, at least to the poset $(E_k^n, \le)$, where $(E_k, \le)$ is a chain $0 < 1 < \ldots < k - 1$, although I cannot give any reference straight away. For the Boolean case, people have also investigated notions of complexity other than worst-case for this problem.

[1] G. Hansel, Sur le nombre des fonctions booléennes monotones de n variables. C. R. Acad. Sci. Paris, 262(20), 1088-1090 (1966)

[2] V. K. Korobkov. Estimation of the number of monotonic functions of the algebra of logic and of the complexity of the algorithm for finding the resolvent set for an arbitrary monotonic function of the algebra of logic. Soviet Math. Doklady 4, 753-756 (1963) (translation from Russian)

[3] B. Kovalerchuk, E. Triantaphyllou, A. S. Deshpande, E. Vityaev. Interactive learning of monotone Boolean functions. Information Sciences 94(1), 87-118 (1996)

— dd1

Thanks a lot for this detailed answer! For the Boolean $n$-cube, cf <cstheory.stackexchange.com/q/14772>. I can read French but couldn't find Hansel's paper (should have been available on Gallica but this issue seems to be missing), I found relevant info in Sokolov, N.A. (1982), "On the Optimal Evaluation of Monotonic Boolean Functions", USSR Comput Math Math Phys, Vol 22, No 2, 207-220 (English translation exists). I'm interested about generalizations to other DAGs if you can find refs. Don't hesitate to reply by email (a3nm AT a3nm DOT net) if length limit is a problem. Thanks again!

— a3nm

You are welcome! Unfortunately, I do not know how to bound the algorithm running time in terms of output size. Korobkov's proof of the lower bound, for instance, does not give an answer to that question. However, I feel there may be a reference that is slightly relevant. I'll try to find some time over the weekend and look for generalizations as well. At the same time, I'm not sure whether a closed English description of the Boolean case (these two theorems) is worth writing...

— dd1

@a3nm maybe the DAG case hasnt been considered in the literature? could it be harder than the boolean n-cube ordered by inclusion?

— vzn

@vzn I guess that at least some of the questions here are bound to be open. Even for a chain, it is not immediately clear how to generalize Hansel's algorithm.

— dd1

@a3nm it all seems to be similar to finding lower bounds/minimal monotone circuits (sizes) but havent seen it clearly linked so far...

— vzn

0

[NOTE: The following argument doesn't seem to work, but I'm leaving it here so others don't make the same mistake / in case someone can fix it. The issue is that an exponential lower bound on learning/identifying a monotone function, as below, does not necessarily contradict an incrementally polynomial algorithm for the problem. And it is the latter which is equivalent to checking the mutual duality of two monotone functions in poly time.]

I believe your conjecture on $\log N_X$ is false in general. If it is indeed the case that $\log N_X$ queries are needed, that implies quite a strong lower bound on learning monotone functions using membership queries. In particular, let the poset $X$ be the Boolean cube with the usual ordering (if you like, $X$ is the powerset of $\{1,...,n\}$ with $\subseteq$ as its partial order). The number $M$ of maximal antichains in $X$ satisfies $\log M = (1 + o(1))\binom{n-1}{\lfloor n/2 \rfloor}$ [1]. If your idea on $\log N_X$ is correct, then there is some monotone predicate on $X$ that requires essentially $\binom{n-1}{n/2} \approx 2^n$ queries. In particular, this implies a lower bound of essentially $2^n$ for the complexity of any algorithm solving this problem.

However, if I've understood correctly [which I now know I hadn't], your problem is equivalent to checking the mutual duality of two monotone functions, which can be done in quasi-polynomial time (see the intro of this paper by Bioch and Ibaraki, which cites Fredman and Khachiyan), contradicting anything close to a $2^n$ lower bound.

[1] Liviu Ilinca and Jeff Kahn. Counting maximal antichains and independent sets. arXiv:1202.4427

— Joshua Grochow

Josh, I don't see a problem with the $\log N_X$ argument. my understanding is that it is open whether a monotone function can be learned in time polynomial in $n$ and the number of minimal elements. the Bioch-Ibaraki paper is about incrementally polynomial algorithm

— Sasho Nikolov

Ah, okay. I wasn't aware of that. (Like I said, I'm not an expert in this area - my answer was just based on looking up a few things and putting them together.) I'll leave it here so other people can see it and at least not make the same mistake / at best turn it into something useful.

— Joshua Grochow