Caleb Stanford Blog


Dominion Online tournament 2019 (Round 2)


dominion

I’m participating in the Dominion Online tournament again this year. I didn’t post about Round 1 this year, but here are Round 1 and Round 2 from last year.

For each of the 4 games, I provide the kingdom, and the puzzle is to guess the most valuable cards on the board (before looking at the spoiler), i.e. which cards are most important to the best strategy. For reference, a list of all Dominion cards can be found here.

These boards were really cool! At least two of them involved strategies I had never seen before, and none of them involved a totally standard or obvious strategy.

Game 1

Kingdom 1

Spoiler:

Most valuable card(s): City Quarter and Leprechaun. Honorable Mention: Pixie.

This board has everything we need to build an engine, with only one problem: gains. City Quarter on its own is plenty of draw and actions, and it's even better because we can set up the start of our turn with Dungeon. But to build the engine, and to make effective use of it at the end, we're going to need more than one gain per turn. That's where Leprechaun and Pixie come in.

Leprechaun is not a good buy early on (Gold clogs up our deck and makes it difficult to get the engine going), but it becomes of critical importance once we can activate it every turn. (Sadly, there's no way to activate it twice.) Wish can be used to pick up engine components or Duchies -- against an opponent who is only buying one card per turn, that is very strong.

Pixie is also really good here. Because gains are limited, we wait until we hit Earth's Gift, Forest's Gift, or Swamp's Gift. In my game, I had about $14 in my deck after Leprechaun had activated for a few turns -- this is overkill, because only $8 of it can be used. That means that if we get Forest's Gift once, we can double Province, which is likely game-deciding.

The alternative to all of this is to play Dungeon Money, perhaps with a Leprechaun or a Tormentor and some Pixies and Ironmongers. But the money deck can only get 1 province per turn, so it should lose to the engine.

Game 2

Kingdom 2

Spoiler:

Most valuable card(s): Remodel, Lackeys, and Market Square.

The speed of a deck based almost entirely around these three cards really surprised me. I was able to pile the provinces on T13 and didn't run out of villagers.

So what's the thought process here? We know there are no villages other than Lackeys, so we can't hope to build an engine that lasts for more than a couple of turns. We also know that the only trashing is from Remodel, which looks very strong -- remodeling Estates is good as always, and here we can also remodel Copper to Lackeys.

The next thing we notice is that Market Square combos well with Remodel for two reasons: first, Remodel activates Market Square to give us Golds, and second, Remodel loves Gold because it can be turned into Province to score a lot of points quickly.

I ended up spamming a *lot* of Lackeys. On most boards, that doesn't really work; you go through villagers and quickly run out of steam. Here, we can remodel some Lackeys into more Remodels instead of playing them, so that we have 3-5 Remodels in deck. In the process we get a bunch of Golds with Market Square. Finally, we can empty the Provinces by remodeling Gold into Province and/or Province into Province several times over a few turns.

Cobbler is also good; it can gain any of the key deck pieces. I opened with it on 5/2.

Game 3

Kingdom 3

Spoiler:

Most valuable card(s): Candlestick Maker and Conquest. Honorable Mention: Legionary and Cursed Village.

What I learned on this board is that Conquest loves coffers. What is normally a pretty weak alt-VP card becomes very strong if you can stockpile coffers, and then spend them all on a big turn (in my case, spending 30 coins on 5 Conquests, which is 2 + 4 + 6 + 8 + 10 = 30 VP). I think you should be able to ignore Province entirely -- I made the mistake of buying a few Provinces before Conquesting, which caused me to almost lose.
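To make the scaling explicit (a quick back-of-the-envelope calculation based on the numbers above): each Conquest gains 2 Silvers and gives +1 VP per Silver gained that turn, so the \(k\)-th Conquest bought in a single turn is worth \(2k\) VP. Buying \(n\) Conquests at \(\$6\) each in one turn therefore costs \(6n\) coins and scores

\[\sum_{k=1}^{n} 2k = n(n+1) \text{ VP},\]

which is 30 VP for \(n = 5\) as above, and keeps growing quadratically the longer you stockpile coffers before the big turn.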

Thought process: Legionary is a strong attack, which means the engine is probably favored if it manages to play Legionary every turn. Therefore we want to build an engine, but we see that the only +Actions is from Cursed Village, and there aren't any good cantrips. That means the engine has to play with Cursed Village, so it's basically a draw-to-X deck. Hireling doesn't really help at all in such a deck, but Candlestick Maker works very well, as does trashing with Raze and playing 1-2 Legionaries for economy. Also, we can use Gear to accelerate hitting price points early on, and later to set aside cards in order for Cursed Village to draw.

Such a deck wouldn't normally be very good, because Candlestick Maker is just not a lot of economy. But if this deck is not mirrored, you can get the entire stack of Candlestick Makers and then stockpile coffers into an arbitrarily large Conquest turn. Plus, playing Legionary every turn really slows your opponent down. FWIW, this game lasted 23 turns, which is quite long, and that was with me making the mistake of getting a few Provinces. (The reason you don't want Provinces as the engine player is that it makes the game end faster -- and the longer the game goes on, the bigger your eventual Conquest turn.)

Game 4

Kingdom 4

Spoiler:

Most valuable card(s): City Quarter and Storeroom.

City Quarter combos well with any discard-for-benefit card -- e.g. Artificer, Vault, and Storeroom. (Incidentally, another discard-for-benefit card is present here, Mill. But it doesn't combo nearly as well as these other cards.) The idea is that you discard half of your hand, leaving only actions, then play CQ to draw everything up again, and repeat. The City Quarter/Storeroom deck scales economy a lot faster than e.g. just playing with City Quarter, Wharf, and Gold.

Wharf is also great, because it sets up the start of turn for City Quarter (to make sure we don't dud). Hideout and Lookout are both good trashers; I'm not sure which is better, but I think you get two total. Finally, Dismantle (targeting Gold) is a good gainer; I used it to gain Wharves.

Big Data Health Monitoring Should Be Mainstream Already


research health

The healthcare system in the US is often rightfully criticized for the cost to patients: insurance costs, “out-of-network” costs, and so on. But in my own interactions with healthcare over the last year or two, I’ve been equally frustrated by something else: medical data is not being sufficiently gathered and utilized. In the age of big data, medicine seems to be many years behind:

  1. A large amount of medical data is generated in current healthcare routine, but outside of scientific studies it is only utilized on a per-patient basis.

  2. An even larger amount of medical data could be easily gathered, through patient-recorded logs and wearable monitoring devices, but is not being gathered or utilized.

This is sad to me, because if we’ve learned anything in the last 10 years, it’s that data is extremely valuable when used in aggregate, at maximum scale, to update models and make predictions. In particular, we are probably missing out on more sophisticated, more accurate, and more proactive health monitoring.

I’m not suggesting that computers should replace humans as doctors; at least, not anytime soon. Although algorithms can analyze data faster and at a larger scale than humans, they probably can’t replace doctor expertise in the short term, especially with interactive tasks (like deciding what questions to ask in response to patient symptoms, or determining what data would be most valuable to collect).

What I am suggesting is that we (1) gather more data, and (2) make inferences at a larger scale, from a larger amount of data, so that they can be applied in a more individualized fashion. It was encouraging to read that at least one researcher is taking this approach.

Can you be more specific about “medical data”?

Medical data points are taken constantly, even under the current system. If you go in for a doctor visit, several data points are gathered: the problem or symptoms you have (or lack thereof), as well as basic vital signs (pulse, blood pressure) and sometimes blood tests, urine tests, etc. Additionally, many people take data measurements at home, e.g. fitness trackers, sleep trackers, and blood pressure measurements; and people often notice symptoms and record them (e.g. stomach ache today, chest pain or back pain, feeling lousy in general, etc.). Many people would also be happy to record and track more data. I personally keep a log of minor health problems — just in case some pattern emerges, I suppose.

How is the way we currently use medical data insufficient?

It seems to me that current medical data is not used in aggregate to update medical models and improve future predictions. In fact, my doctors have not looked at my individual data points at all. Instead, you go in for a sick visit and they listen briefly to your array of symptoms, or you go in to get help with a specific thing like sleep, or you go in for a yearly checkup and they review any problems or questions you have. Then, the doctor (a human) makes a prediction or diagnosis about a specific issue, but does not compare with your past data in aggregate and has often forgotten about other minor or major problems you may have experienced. Additionally, many minor issues are ignored if they cannot be easily diagnosed. Finally, doctors do not apply the information about your case to future cases; thus, revision to standard practices only happens over a longer period of time through medical research and controlled studies.

The result of this limited use of health data is that while doctors are good at diagnosing specific sicknesses and medical problems, they are not good at the overall monitoring of “healthy” individuals to make sure they continue to stay healthy, without risk of future problems. Such monitoring would require a much more subtle understanding of our medical data; an understanding that is evidently not possible using the current approach.

But is there any guarantee that more aggressive data mining would be successful? Would models and predictions actually improve?

No strict guarantee, no. But from my perspective there is good reason to believe the answer to the second question is “yes”; all the recent breakthroughs in big data and machine learning have shown that with enough data and computational resources, it’s usually possible to outperform humans on concrete tasks (e.g. predicting future data from past data). These methods are only continuing to improve and be applied to various domains, not just computer science problems. So I am confident that machine learning algorithms will get better, and there is no reason this progress will not extend to medical diagnosis and monitoring. However, there is one important precondition: machine learning algorithms require a lot of data to perform well.

Right now, no one has access to that kind of data. But the data exists and much of it is already collected, or else easy to collect. Perhaps in the near future, we can utilize it?

ProvableP vs ProvableNP


math logic-and-computation

Introduction and Background

The \(\mathsf{P}\) versus \(\mathsf{NP}\) question asks: is it possible to determine whether a Boolean formula is satisfiable, in time polynomial in the length of the formula? Determining whether a Boolean formula is satisfiable is called the SAT problem, and programs which solve the problem are called SAT solvers. SAT solvers are used in many modern programming tools (usually, for automatically solving some system of constraints), so the SAT problem is of great practical importance today. And it is often suggested that a solution to “\(\mathsf{P}\) versus \(\mathsf{NP}\)” would resolve whether there exists any efficient algorithm at all for SAT.
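To make the SAT problem concrete, here is a minimal brute-force check (an illustrative sketch of my own, not how real SAT solvers work): it tries all \(2^n\) assignments of the \(n\) variables, which is exactly the exponential blow-up that a polynomial-time algorithm would have to avoid.

```python
from itertools import product

def brute_force_sat(clauses, num_vars):
    """Try all 2^num_vars assignments; return a satisfying one, or None.

    Clauses use a DIMACS-like convention: the integer k stands for
    variable k, and -k stands for its negation.
    """
    for bits in product([False, True], repeat=num_vars):
        assignment = {i + 1: bits[i] for i in range(num_vars)}
        # A clause is satisfied if some literal in it is true; the formula
        # is satisfied if every clause is.
        if all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses):
            return assignment
    return None  # no assignment works: the formula is unsatisfiable

# (x1 or not x2) and (x2 or x3)
print(brute_force_sat([[1, -2], [2, 3]], num_vars=3))
```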

There are a number of reasons why this suggestion could be wrong, or at least, a number of ways to argue against it. But here’s a particular thought: what if \(\mathsf{P} = \mathsf{NP}\) (that is, SAT can be solved in polynomial time), but the polynomial-time algorithm is beyond our ability to find – and in particular, beyond our ability to prove correct? What if there exists some magic algorithm that happens to work, but there doesn’t exist any proof that it works? Or perhaps there doesn’t exist any proof that it runs in polynomial time? Such a scenario would be an unsatisfying resolution to \(\mathsf{P}\) versus \(\mathsf{NP}\), in a way: practically speaking, we generally only trust algorithms that we can prove correct.

The question I’m trying to motivate is: does \(\mathsf{ProvableP} = \mathsf{ProvableNP}\)? That is, what happens if we restrict our attention to algorithms which can be proven correct? Is this question answerable, or is it just as hard as \(\mathsf{P}\) versus \(\mathsf{NP}\)?

Summary

It turns out that \(\mathsf{P} = \mathsf{NP}\) if and only if \(\mathsf{ProvableP} = \mathsf{ProvableNP}\). I’ll define what I mean by \(\mathsf{ProvableP}\) and \(\mathsf{ProvableNP}\). This post is adapted from a question and answer I made on Computer Science StackExchange.

The ProvableP versus ProvableNP question

Fix a logic \(\mathcal{L}\) that is strong enough to encode statements about Turing machines. By this I mean the same requirements as Gödel’s second incompleteness theorem; but for the purposes of this question, just assume the following:

  • \(\mathcal{L}\) is consistent (it doesn’t prove False); and

  • \(\mathcal{L}\) implies the Peano arithmetic axioms of \(\mathbb{N}\) (PA).

Then define

\[\begin{align*} \mathsf{ProvableP} &:= \{A \mid \mathcal{L} \text{ proves } [A \in P]\} \\ \mathsf{ProvableNP} &:= \{A \mid \mathcal{L} \text{ proves } [A \in NP] \} \end{align*}\]

There’s an issue with this definition: how do we encode languages \(A\) (of which there are uncountably many) as finite objects in order for the logic \(\mathcal{L}\) to have something to say about them? In particular, how do we encode the formulas \([A \in \mathsf{P}]\) and \([A \in \mathsf{NP}]\)? Let us assume that languages are subsets of the natural numbers \(\mathbb{N}\), and that any \(A\) in \(\mathsf{ProvableP}\) or \(\mathsf{ProvableNP}\) must be given by a formula over the symbols of \(\mathcal{L}\); i.e., it is a definable subset of \(\mathbb{N}\).
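For concreteness, here is roughly what such a formula could look like (a sketch of one possible encoding, with the usual arithmetization details omitted): if \(\varphi_A(x)\) is the formula defining \(A\), then \([A \in \mathsf{P}]\) can be read as

\[\exists M\, \exists c\, \forall x\, \Big( M \text{ halts on } x \text{ within } |x|^c + c \text{ steps} \;\wedge\; \big(M \text{ accepts } x \leftrightarrow \varphi_A(x)\big) \Big),\]

where \(M\) ranges over (codes of) deterministic Turing machines; \([A \in \mathsf{NP}]\) is analogous, with nondeterministic machines.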

From the definition we have that \(\mathsf{ProvableP} \subseteq \mathsf{ProvableNP} \subseteq \mathsf{NP}\). But does \(\mathsf{ProvableP} = \mathsf{ProvableNP}\)? How does this relate to the original \(\mathsf{P}\) vs \(\mathsf{NP}\) question?

The Answer

Obligatory warning: to my knowledge, this proof has not been rigorously checked by someone else :) But we now argue that

\[\boxed{\mathsf{ProvableP} = \mathsf{ProvableNP} \;\;\text{ if and only if }\;\;\mathsf{P} = \mathsf{NP}.}\]

For the forward direction, assume \(\mathsf{ProvableP} = \mathsf{ProvableNP}\), and consider any language \(A\) in \(\mathsf{NP}\). Note that \(A\) is accepted by a nondeterministic Turing machine \(N\), such that there exists a polynomial \(p(n)\) which bounds the longest execution branch of \(N\) on inputs of length \(n\). Then let \(N'\) be a nondeterministic Turing machine with the following description: first, count the input length \(n\); second, calculate \(p(n)\); and finally, run \(N\) nondeterministically for at most \(p(n)\) steps. If \(N\) hasn’t halted within \(p(n)\) steps (this never actually occurs, but \(N'\) doesn’t know that for sure), \(N'\) rejects.

Now observe that \(\mathcal{L}\) (more specifically, PA) proves that \(N'\) runs in time \(p(n)\). Moreover, \(\mathcal{L}\) can describe “the language of strings accepted by \(N'\)” by a formula (roughly, “there exists a sequence of configurations such that the sequence is a run of \(N'\) and ends in an accept state”). Then we have that \(\mathcal{L}\) proves \([L(N') \in \mathsf{NP}]\). Therefore, \(L(N') \in \mathsf{ProvableNP}\). But we know (even if \(\mathcal{L}\) doesn’t) that \(L(N') = L(N) = A\) by construction. So \(A \in \mathsf{ProvableNP}\). But \(\mathsf{ProvableNP} = \mathsf{ProvableP} \subseteq \mathsf{P}\), so \(A \in \mathsf{P}\).
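Spelled out a bit more (again just a sketch of the intended formula), the statement that \(\mathcal{L}\) proves is roughly

\[x \in L(N') \;\leftrightarrow\; \exists w\, \big( |w| \le q(|x|) \;\wedge\; w \text{ encodes an accepting run of } N' \text{ on } x \big),\]

where \(q\) is a polynomial bounding the size of a run of \(p(|x|)\) steps; it is the explicit clock on \(N'\) that lets PA verify this polynomial bound.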

The backward direction is a similar trick. Assume \(\mathsf{P} = \mathsf{NP}\) and that \(A \in \mathsf{ProvableNP}\). Then \(\mathsf{ProvableNP} \subseteq \mathsf{NP} = \mathsf{P}\), so \(A \in \mathsf{P}\). From here, we know there is some Turing machine \(M\) and polynomial \(p(n)\) such that \(M\) runs in time \(p(n)\) and \(L(M) = A\). Let \(M'\) be a deterministic Turing machine which, on input of length \(n\), first calculates \(p(n)\), and then runs \(M\) for at most \(p(n)\) steps. If \(M\) hasn’t halted by then, \(M'\) returns a default value, say \(0\).
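This clocking trick can be pictured as a small sketch (the helpers `step` and `initial_config` are hypothetical stand-ins for a Turing-machine simulator; the actual construction works directly on machine descriptions, not Python functions):

```python
def clocked_run(step, initial_config, p, x):
    """Simulate a machine for at most p(len(x)) steps on input x.

    `step` advances a configuration by one step and reports whether the
    machine has halted and, if so, whether it accepted. If the clock runs
    out first, return a default value (reject), so the whole simulation
    provably takes on the order of p(len(x)) steps.
    """
    config = initial_config(x)
    for _ in range(p(len(x))):
        config, halted, accepted = step(config)
        if halted:
            return accepted
    return False  # default value when the clock runs out

# Toy usage: a "machine" whose configuration is a countdown and which
# accepts when the countdown reaches zero.
print(clocked_run(
    step=lambda c: (c - 1, c - 1 == 0, True),
    initial_config=lambda x: len(x),
    p=lambda n: n + 1,
    x="abc",
))  # True: the toy machine halts within the clock
```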

Then similarly to before, \(\mathcal{L}\) proves that \(M'\) halts in time \(p(n)\), and therefore that \([L(M') \in \mathsf{P}]\). It follows that \(L(M') \in \mathsf{ProvableP}\), but we know (even if \(\mathcal{L}\) doesn’t) that \(L(M') = L(M) = A\), so \(A \in \mathsf{ProvableP}\). Since \(\mathsf{ProvableP} \subseteq \mathsf{ProvableNP}\) always holds, this gives \(\mathsf{ProvableNP} = \mathsf{ProvableP}\). \(\quad \square\)
