What our crew keeps answering in the dev chat

Updated 27 May 2026 · 10 min read

By Raul Moriarty · Poker Software Expert

Every couple of months a new developer joins the project and asks the same questions. We pinned the answers. Twenty-one of them: solver compilation, opponent-model convergence under anonymous tables, latency budgets that bit us last year, detection topology, and the LLM questions we keep getting that we have to politely shut down.

What this FAQ covers

Solver theory and compilation: GTO baselines, action and state abstraction, multiway compression.
Opponent modelling: convergence under anonymous tables, Bayesian priors, archetype bucketing.
Engineering: latency budgets, multi-table state isolation, UI automation, mobile-vs-desktop tradeoffs.
Detection: behavioural shaping, action-timing distributions, adversarial-classification frame.
The open frontier: where the research is moving in 2026.

Figure: spectrum of poker AI from solved to open. Heads-up limit and preflop ranges are solved, 6-max NLH is near-solved, and deep multiway turn play and ICM-heavy MTT endgames remain open research frontiers. Caption: what's settled versus where the 2026 research actually is.

01 What is the difference between a GTO solver and a real-time poker bot?

A GTO solver computes a fixed strategy for a specific game tree (preflop ranges, a given board, a given set of betting options) by iterating CFR or one of its variants: PioSolver and GTO+ for heads-up and six-max spots, MonkerSolver for other situations. The output is a large lookup table of frequencies for each possible action. A real-time bot uses that table as its baseline and queries it quickly (under 1 ms) to find the actions that fit the current live-game state. On top of the solver output, the bot overlays an opponent model that identifies exploitable weaknesses in how the opponent actually plays, and that moves the chosen action away from what the solver alone would recommend. The solver is the theory book; the bot is the actual player.
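To make "iteration of CFR" concrete, here is a minimal sketch of vanilla counterfactual regret minimisation on Kuhn poker, the three-card toy game usually used to teach the algorithm. It is not how PioSolver, GTO+ or MonkerSolver are implemented; the names `Node`, `cfr` and `train` are illustrative, and the game is far smaller than any Hold'em tree. The resulting dictionary of average strategies is the "lookup table" in miniature.

```python
from itertools import permutations

ACTIONS = "pb"  # p = pass (check or fold), b = bet (bet or call)


class Node:
    """Regret and strategy accumulators for one information set."""

    def __init__(self):
        self.regret_sum = [0.0, 0.0]
        self.strategy_sum = [0.0, 0.0]

    def strategy(self, own_reach):
        # Regret matching: play each action in proportion to its positive regret.
        positive = [max(r, 0.0) for r in self.regret_sum]
        total = sum(positive)
        strat = [p / total for p in positive] if total > 0 else [0.5, 0.5]
        for i, s in enumerate(strat):
            self.strategy_sum[i] += own_reach * s
        return strat

    def average_strategy(self):
        # The average over iterations is what converges towards equilibrium.
        total = sum(self.strategy_sum)
        return [s / total for s in self.strategy_sum] if total > 0 else [0.5, 0.5]


def cfr(cards, history, reach, nodes):
    """Return the expected utility for the player to act at `history`."""
    player = len(history) % 2
    if len(history) >= 2:
        wins = cards[player] > cards[1 - player]
        if history == "pp":          # check-check: showdown for the antes
            return 1.0 if wins else -1.0
        if history[-1] == "p":       # a bet was folded to: player to act wins
            return 1.0
        if history[-2:] == "bb":     # bet-call: showdown for ante plus bet
            return 2.0 if wins else -2.0
    node = nodes.setdefault(f"{cards[player]}{history}", Node())
    strat = node.strategy(reach[player])
    utils = []
    for i, action in enumerate(ACTIONS):
        next_reach = list(reach)
        next_reach[player] *= strat[i]
        # The child value is from the opponent's point of view, so negate it.
        utils.append(-cfr(cards, history + action, next_reach, nodes))
    node_util = sum(s * u for s, u in zip(strat, utils))
    for i in range(2):
        # Counterfactual regret is weighted by the opponent's reach probability.
        node.regret_sum[i] += reach[1 - player] * (utils[i] - node_util)
    return node_util


def train(iterations):
    """Run CFR over every deal of two cards from {1, 2, 3} each iteration."""
    nodes = {}
    for _ in range(iterations):
        for cards in permutations((1, 2, 3), 2):
            cfr(cards, "", [1.0, 1.0], nodes)
    return nodes
```

Calling `train(2000)` and reading `nodes["3b"].average_strategy()` gives the frequencies for holding the highest card while facing a bet; the key format (card followed by action history) is just this sketch's convention.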

02 How does opponent modelling work in heads-up vs 6-max?

The model architecture depends on how many players are at the table. Heads-up (HU), a fine-grained model per board texture can process every action the opponent takes during the session and update that opponent's probabilities on every hand. In 6-max there are five opponents, and each opponent's model can only be updated when that player enters a hand. The model therefore has to start from broad behavioural categories (loose-passive, tight-aggressive and so on) and refine which category best fits each player over the course of the session.
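The 6-max bucketing step can be sketched as a Bayesian update over a handful of archetypes. Everything specific here is an assumption for illustration: the four archetype names, their VPIP/PFR values, and the simplification that each preflop decision is an independent fold/call/raise draw.

```python
import math

# Hypothetical archetypes: (VPIP, PFR) as fractions of hands. Illustrative only.
ARCHETYPES = {
    "tight_aggressive": (0.20, 0.16),
    "loose_aggressive": (0.32, 0.26),
    "loose_passive": (0.40, 0.08),
    "tight_passive": (0.14, 0.05),
}


def action_probs(vpip, pfr):
    """Preflop action likelihoods implied by an archetype's VPIP and PFR."""
    return {"raise": pfr, "call": vpip - pfr, "fold": 1.0 - vpip}


class ArchetypePosterior:
    """Posterior over archetypes for one opponent, kept in log space."""

    def __init__(self):
        uniform = math.log(1.0 / len(ARCHETYPES))
        self.log_post = {name: uniform for name in ARCHETYPES}

    def observe(self, action):
        # Add the log-likelihood of the observed action under each archetype.
        for name, (vpip, pfr) in ARCHETYPES.items():
            self.log_post[name] += math.log(action_probs(vpip, pfr)[action])

    def posterior(self):
        # Subtract the maximum before exponentiating for numerical stability.
        top = max(self.log_post.values())
        weights = {k: math.exp(v - top) for k, v in self.log_post.items()}
        total = sum(weights.values())
        return {k: w / total for k, w in weights.items()}

    def best(self):
        return max(self.log_post, key=self.log_post.get)
```

Each opponent gets its own `ArchetypePosterior`, updated only when that player makes a preflop decision, which is why the 6-max posterior sharpens more slowly than a heads-up one.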

03 What stakes are economically viable for AI decision engines after rake?

At low stakes, rake is the dominant cost. At GGPoker, rake is 6-8 percent on NL10 six-max, which rakeback mitigates significantly; it is roughly 4-5 percent on NL50 and about 3 percent net on NL200. As you move up in stakes, expected value per hand also falls because regs are increasingly solver-aware, so the EV-per-hour curve will likely peak somewhere in the NL25-NL100 range, depending on the game format. The maths for Spin & Gold is fundamentally different from other formats because the rake structure is flat and the variance is heavily skewed: the €5 and €10 buy-ins can show positive expected value over a large enough sample, but it takes thousands of hands for that to show.

04 How do detection systems differentiate human reaction time variance from bot timing?

By shape, not by mean or variance alone. Human action-timing distributions are log-normal-ish with state-dependent location and heavy right tails — snap-folds in 600-1200ms, hard rivers in 5–30 seconds, occasional 8–25 second "distraction" tails that are state-independent. A bot that adds uniform noise around a centroid has the wrong distributional shape. A bot that draws from a fixed log-normal also fails because the location parameter needs to condition on decision difficulty. The full discussion is in the detection note.

05 Why is preflop play not "solved" the way endgame chess is?

Two reasons, both rooted in branching factor. Chess has roughly 35 legal moves in a typical position; No-Limit Hold'em has a continuous bet-size space layered on top of fold/check/call/raise, so the action set is effectively unbounded. To solve it you have to discretise bet sizes into a finite abstraction, and every abstraction throws away information. Second, poker is a game of imperfect information — you never see the opponent's cards — so the "solution" is a probability distribution over actions at every node rather than a single best move, which is a fundamentally larger object to compute and store than a chess tablebase.

06 What is the role of LLMs in poker decision-making?

None, or virtually none, for real-time decision making; some utility for post-hand and adjacent tasks. Frontier LLMs generate range estimates that ignore ICM, make poor use of prior knowledge about relative frequencies, and show little to no understanding of how to size bets for the current game state. A heads-up bot built on solver output can beat them at a $5 buy-in. Where they do help is off the table: annotating hand histories, generating candidate exploit hypotheses worth solver-checking, producing synthetic training data, and summarising solver output into plain language. The gap between "useful hand-history annotator" and "useful in-the-loop player" is far wider than most people assume.

07 How fast does an online opponent model need to converge under anonymous tables?

The empirical floor for spotting a usefully exploitable deviation is about 80 to 150 hands of joint observations: enough to estimate VPIP and PFR to within roughly ±5 percent, with confidence intervals on c-bet responses and 3-bet frequencies. In theory, a Bayesian prior conditioned on population characteristics can reduce this to 40 to 60 hands. How stable that prior remains when moving from a fish-versus-reg bimodal distribution (as seen at older European poker sites) to the much tighter bimodal distribution found at GGPoker is still an open research question.
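The effect of a population prior on convergence can be illustrated with a Beta-binomial update on VPIP. This is a sketch under stated assumptions: the prior `Beta(5, 15)` (mean 0.25) is a made-up stand-in for a population prior, not a fitted value, and VPIP is treated as a simple per-hand Bernoulli rate.

```python
import math


def beta_update(prior_a, prior_b, successes, trials):
    """Conjugate update: add observed VPIP hands and non-VPIP hands to the prior."""
    return prior_a + successes, prior_b + (trials - successes)


def beta_mean_sd(a, b):
    """Mean and standard deviation of a Beta(a, b) distribution."""
    mean = a / (a + b)
    sd = math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
    return mean, sd


if __name__ == "__main__":
    # 50 observed hands, 20 of them played voluntarily.
    for label, prior in (("flat", (1, 1)), ("population", (5, 15))):
        mean, sd = beta_mean_sd(*beta_update(*prior, successes=20, trials=50))
        print(f"{label:10s} prior -> VPIP {mean:.3f} +/- {sd:.3f}")
```

The informative prior narrows the posterior at the same sample size, which is the mechanism behind the 40 to 60 hand figure, but it also pulls the estimate towards the prior mean. If the assumed population shape is wrong for the site, that pull becomes bias, which is exactly the stability question raised above.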

08 How does multi-table state isolation work without leaking timing?

Each table runs its own state machine with its own decision queue. A naive implementation shares a CPU and clocks decisions sequentially, which creates observable cross-table action latency correlations — table A's hard river decision delays table B's snap-fold, and the correlation is detectable. The right architecture is preemptive: a tough decision on table A surrenders the CPU and rejoins the queue when easier decisions are pending, with each table's behavioural-timing distribution sampled independently. Production systems also add a per-table "alertness" parameter that drifts down over long sessions to mimic human fatigue.

09 What latency budget per action is realistic on mobile?

On a modern Android device (Snapdragon 8 Gen 2 class), 30-80ms for solver-table lookup, 5-15ms for opponent-model update, 10-30ms for UI inspection on the rendered client. Total compute budget around 100-150ms per action. The human-perceptible action latency is the behaviourally-shaped sample on top of that, which is typically 600ms to several seconds. The compute budget is comfortable; the constraint is power consumption over long sessions.

10 Why is the multiway turn one of the hardest spots to automate?

Three streets of betting, a board that has revealed most of its information, and 2-4 opponents whose ranges have narrowed materially. The game tree under reasonable bet-sizing abstraction blows up faster than any other street. MonkerSolver's depth-limited subgame solving handles the math but only for specific tree shapes; producing a general-purpose multiway turn policy that compiles to a mobile inference budget is genuinely open. The current production approach is opportunistic — solved for high-frequency tree shapes, fall back to heuristic-plus-opponent-model for the rest.

11 Why is Spin & Gold a separate engine from cash NLH?

Three-handed hyper turbos have completely different math from 6-max cash: 25bb starting stacks, super-fast blind levels, multiplier-aware ICM at the high end (a 1000x spin has different bubble dynamics than a flat 2x). Push-fold charts dominate from about 12bb down; the equilibrium changes shape entirely. Using a cash engine on a Spin table loses money fast because the cash engine over-deepens its lines into stack depths that do not exist.
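For readers unfamiliar with ICM, the standard Malmuth-Harville calculation converts chip stacks into prize equity. The sketch below is generic tournament maths, not the project's engine, and the payout vectors are hypothetical: actual Spin & Gold payout structures per multiplier are not given in this FAQ.

```python
def icm_equity(stacks, payouts):
    """Malmuth-Harville ICM: each player's share of the prize pool.

    stacks:  chip counts, one per player.
    payouts: prize for 1st, 2nd, ... place (may be shorter than stacks).
    """
    equity = [0.0] * len(stacks)

    def recurse(remaining, place, prob):
        if place >= len(payouts) or not remaining:
            return
        total = sum(stacks[i] for i in remaining)
        for i in remaining:
            # Probability that player i takes this place is their chip share
            # among the players who have not yet been assigned a place.
            p = prob * stacks[i] / total
            equity[i] += p * payouts[place]
            recurse([j for j in remaining if j != i], place + 1, p)

    recurse(list(range(len(stacks))), 0, 1.0)
    return equity
```

With a winner-take-all payout such as `[1.0]`, equity is simply proportional to chips and ICM pressure vanishes; with a split payout such as the hypothetical `[0.8, 0.12, 0.08]`, the short stack's chips are worth more than their face value, which is the sense in which the bubble dynamics depend on the multiplier.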

12 What changes in PLO compared to NLH from a bot-architecture perspective?

Equity distributions flatten because each player has six two-card combinations rather than one — preflop equity differences between hands shrink, blockers matter more, and turn play has higher EV-per-decision than river play. The solver outputs are larger and harder to compress, opponent-modelling features differ (3-bet frequency by hand class is less informative; reraise frequency given board texture is more), and the UI automation layer needs to handle four hole cards instead of two. The same engineering scaffolding applies; the strategic content is genuinely different.

13 How do you compile a Pluribus-scale solver output into a mobile inference budget?

There are two compression methods. The first is state abstraction: collapsing the game state into a smaller set of equivalence classes based on board texture, stack-to-pot ratio, position, and the preceding actions, so that the lookup key shrinks from the full state space to an integer index. The second is action abstraction: discretising the possible bet sizes to a finite set (for example 33%, 66%, 100%, 150%) and solving for that abstraction level. Together they typically shrink the data by four to five orders of magnitude relative to the raw CFR output, at the cost of a bounded EV loss relative to the uncompressed strategy.
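Both abstractions can be sketched in a few lines. The bet fractions come from the example above; the texture labels, SPR bucket edges, position labels and the cap on prior raises are hypothetical placeholders chosen to keep the sketch small, and a production abstraction would be far finer-grained.

```python
from bisect import bisect_right

BET_FRACTIONS = (0.33, 0.66, 1.00, 1.50)          # pot fractions from the text
SPR_EDGES = (1.0, 3.0, 6.0, 12.0)                 # hypothetical bucket edges
TEXTURES = ("dry", "wet", "paired", "monotone")   # hypothetical texture classes
POSITIONS = ("ip", "oop")                         # in position / out of position
MAX_RAISES = 3                                    # prior raises are capped here

NUM_STATES = len(TEXTURES) * (len(SPR_EDGES) + 1) * len(POSITIONS) * (MAX_RAISES + 1)


def abstract_bet(bet, pot):
    """Action abstraction: snap a real bet to the nearest solved pot fraction."""
    fraction = bet / pot
    return min(BET_FRACTIONS, key=lambda f: abs(f - fraction))


def spr_bucket(stack, pot):
    """Bucket the stack-to-pot ratio into len(SPR_EDGES) + 1 classes."""
    return bisect_right(SPR_EDGES, stack / pot)


def state_index(texture, stack, pot, position, prior_raises):
    """State abstraction: mixed-radix encoding of the bucketed features."""
    index = TEXTURES.index(texture)
    index = index * (len(SPR_EDGES) + 1) + spr_bucket(stack, pot)
    index = index * len(POSITIONS) + POSITIONS.index(position)
    index = index * (MAX_RAISES + 1) + min(prior_raises, MAX_RAISES)
    return index
```

A strategy table then becomes an array of `NUM_STATES` rows with one frequency per abstract action, and a live bet of 70 into a pot of 100 is looked up as the 66% line. The EV loss mentioned above comes from exactly these roundings: states that share an index are forced to play the same strategy.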

14 How does anti-fingerprinting differ on Android versus iOS?

Android is the production target because the application sandbox is permissive enough to support accessibility-service-driven input automation with relatively stable behavioural shaping. iOS is significantly harder because the sandbox blocks the input-automation surface and the available alternatives (jailbreak, MDM exploitation) are unstable across OS versions. Most production poker AI runs Android-only as a result, which has the side effect of concentrating detection-side analysis on Android-typical telemetry — a small but useful population shift.

15 What signals does the operator collect that a bot author cannot easily spoof?

Hardware-level entropy is the hardest signal to fake. On a mobile device the operator can collect sensor information (accelerometer drift, gyroscope movement during play) as well as CPU clock-frequency variation and battery discharge rate. A device that sits on a desk and runs for 12 hours produces measurably different sensor readings from a device carried around by a person. One implementation places the device in a powered cradle and adds low-level perturbations to the sensor readings programmatically; this may counteract the inexpensive version of the test, but it is unclear whether it holds up against an advanced operator using side-channel analysis.

16 How are bot bans typically structured in time?

Not as an immediate ban-hammer. The visible sequence runs: quiet flag → soft restriction (withdrawal limits drop, KYC re-verification requested) → structured interview from support → confiscation and closure. The cycle takes 2 weeks to 9 months depending on review-queue capacity and triggering events; the most common accelerator is a large first withdrawal. The detection page covers the review pipeline in more depth.

17 Are there published academic benchmarks for poker AI?

Yes, though they lag the commercial frontier. Classical benchmarking ran through the Annual Computer Poker Competition and its successors; agents like Tartanian, Slumbot, and Pluribus were evaluated against fixed opponent pools. Academic work today sits mainly in the Brown–Sandholm lineage at Carnegie Mellon and the DeepStack / DeepMind directions. Over 2023–2026, commercial poker AI has pulled well ahead of any public benchmark, because commercial systems are tuned against real player populations rather than research baselines.

18 How does the engineering map to actual EV in dollar terms?

Rough estimate: the solver baseline gives a near-zero-EV strategy against solid opposition and a positive-EV strategy against the overall population. Opponent modelling adds 1-4 BB/100 against average opponents, depending on how exploitable they are. Behavioural shaping with detection awareness costs 0.5-1.5 BB/100 relative to pure GTO output. The net result is 2-5 BB/100 at NL25-NL100 on GGPoker, which scales with HPPH (60-80 per table, multiplied by multi-tabling) and effective stake. All numbers depend on your pool and shrink as the regulars in it improve their own solver work.
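Converting a win rate into a dollar figure is plain arithmetic. The sketch below assumes "BB" means big blinds and treats the hands-per-hour figure as per table; the function name and the NL50 example inputs are illustrative picks from inside the ranges quoted above, not measured results.

```python
def hourly_ev(winrate_bb_per_100, hands_per_hour_per_table, tables, big_blind):
    """Expected dollars per hour from a win rate quoted in big blinds per 100 hands."""
    hands_per_hour = hands_per_hour_per_table * tables
    return winrate_bb_per_100 / 100.0 * hands_per_hour * big_blind


if __name__ == "__main__":
    # NL50 (big blind $0.50), 3 BB/100, 70 hands per hour per table, 4 tables.
    print(f"${hourly_ev(3.0, 70, 4, 0.50):.2f} per hour")
```

The same 3 BB/100 at NL25 halves the figure and at NL100 doubles it, which is why the answer above stresses that the result scales with effective stake as well as volume.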

19 What is the most under-rated component of a production poker AI?

Reliability in UI automation. The exciting work is in solvers and opponent models, but the product has almost always broken at the screen-scrape and action-emit layers. GGPoker ships client updates 4-6 times per year, and about 50% of them break something the input layer depends on. The development effort that goes into shipping the unglamorous parts of the system each year far exceeds what is spent improving the more interesting mathematics.

20 Where does the research frontier sit in 2026?

The work is concentrated in three places: online opponent-model convergence under anonymous identity rotation; detection-aware policy combiners in the adversarial-classification frame; and compiling Pluribus-scale multiway output into a mobile inference budget without unacceptable EV loss. All three are open problems, and they have been open long enough that a serious contributor can make real progress. Some of the discussion also happens in the team chat.

21 How do you get involved or contribute?

Start by reading the articles linked at the bottom of this page (the detection page) and the open questions on the home page. If you have worked on any of these open problems yourself or have access to related data, use the chat function at the bottom of this article. The chat is generally quiet, as it is monitored solely by the development team; all “can I buy your bot?” messages are archived automatically.

Question we didn't cover?

Ask the team in the chat. The FAQ is updated when a new question gets asked twice.
