Day 07 · Pressure Test Any AI Answer About Property

Why this matters

A confident-sounding AI answer is the most dangerous thing in your research workflow. The previous six days of this series taught you how to underwrite, strategise, audit marketing, wire your stack, and run your portfolio with Claude. Every one of those steps depends on the model telling you the truth, and the model does not always tell you the truth.

The failure modes have names. Anthropic, OpenAI, Google, and the other major AI labs all discuss the same three: hallucination, sycophancy, and bias. Anthropic's own researchers published "Towards Understanding Sycophancy in Language Models" in October 2023, one of the foundational papers on the second mode.

Property is a high-stakes context for all three. A hallucinated rental yield can talk you into a building that does not actually return what the model claimed. A sycophantic answer can validate a deal you were already half committed to. A biased answer can quietly push you into the same three Dubai areas every other retail investor is buying. The fix in each case is a single prompt that you can paste at the end of any Claude conversation.

The model is not always wrong. But it is wrong in patterns. And patterns can be tested.

What you're building

The end result

Three short prompts and three longer pressure-test versions, ready to paste at the end of any Claude conversation about a property decision. The version of AI research where the model has to defend its own answer before you act on it.

You will know which mode you are catching, why each prompt works, and what a worked example looks like on a real property scenario.

What you need

Any Claude tier. The three prompts work on Free, Pro, and Max. No plugins, no MCPs, no setup. Paste, read, decide.

About ten minutes to read the page and save the prompts somewhere you will use them. Pasting one of them takes seconds.

Step 01 · Know what you are catching

The three lies, named.

Each of the three has a different signature, a different cause, and a different fix. Naming them is half the work. Once you can spot which mode is firing, the right prompt is obvious.

Mode 01

i.

Hallucination · wrong facts.

The model invents a fact that sounds real. A confident number, a specific source, a comparable sale that does not exist. The failure is in the content. The signature is unsourced specificity.

Mode 02

ii.

Sycophancy · wrong feedback.

The model agrees with you because you told it the position is yours. Excitement gets mirrored. Doubts get rationalised away. The failure is in the relationship. The signature is unearned agreement.

Mode 03

iii.

Bias · wrong angle.

The model defaults to whatever its training data over-represents. The same three areas, the same strategies, the same logic chain. The failure is in the framing. The signature is suspicious consensus.

Heads up

You will rarely see only one mode at a time. A bad answer is often two or three of these firing together. Run the three prompts in order and you catch the stack, not just the loudest one. Order matters. The confidence check exposes the facts that the other two prompts later test for spin.
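If you keep your prompts in a script rather than a notes app, the fixed order can be made explicit. What follows is a minimal sketch, not part of the series tooling: the names `PRESSURE_TEST` and `follow_ups` are hypothetical, and the prompt text is copied from the one-liners in Steps 02 to 04.

```python
# Hypothetical helper: the three one-liner prompts from this page, kept in run order.
PRESSURE_TEST = [
    ("hallucination", "Score your confidence in this answer from 1 to 10. "
                      "Tag every claim as high, medium, or low confidence. "
                      "Then tell me what you would need to push it higher."),
    ("sycophancy", "Do not validate my assumptions. "
                   "Tell me what a sceptical investment committee would say."),
    ("bias", "Take the opposite position and build the strongest bear case possible."),
]


def follow_ups(modes=None):
    """Return the prompts to paste, always in the fixed order above.

    `modes` optionally restricts the list to some failure modes;
    the order of `modes` itself is ignored on purpose.
    """
    wanted = set(modes) if modes else {name for name, _ in PRESSURE_TEST}
    return [prompt for name, prompt in PRESSURE_TEST if name in wanted]


if __name__ == "__main__":
    for i, prompt in enumerate(follow_ups(), start=1):
        print(f"{i}. {prompt}")
```

Calling `follow_ups(["bias", "hallucination"])` still returns the confidence check first, which mirrors the point that the order matters.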

Step 02 · Catch hallucination

Make the model show its working.

Hallucinations dress up as confidence. The fix is to force the model to score that confidence itself and tag every claim with the strength of the source it would need. Anything in the answer that cannot be verified gets surfaced rather than smuggled in.

01

Catches hallucination

The confidence check.

Two versions. The one-liner is what you paste in passing. The exact prompt is what you paste before signing anything.

The one-liner · paste at the end of any answer

Score your confidence in this answer from 1 to 10. Tag every claim as high, medium, or low confidence. Then tell me what you would need to push it higher.

The exact prompt

For decisions that cost money to get wrong

Before I act on this answer, pressure test it for me.

1. Score your overall confidence in this answer from 1 to 10. Explain that calibration in one sentence.
2. Go claim by claim. For every factual statement you made, tag it as high, medium, or low confidence based on the strength of the source someone would need to verify it.
3. List any numbers, dates, area names, or comparables you introduced. For each one, tell me whether it is quoted from a real source, estimated from related data, or generated without a specific source.
4. Tell me what information I would need to provide to push your overall confidence to a 9 or above.

Be honest about uncertainty. I would rather hear "I do not know" on a claim than read a confident sentence I have to verify later.

Worked example · the rental yield question

Ask: "What is the average rental yield on a one-bedroom apartment in JVC, Dubai?"

Bad answer: "The average gross rental yield on a one-bedroom in JVC is 7.8%, based on current market data."

After prompt: "Confidence: 5/10. The 7.8% figure is a medium-confidence estimate built from listing-level data in my training, not a current verified portal-level number. To push this to a 9, give me a date-bounded range from Property Finder or Bayut for the last 30 days of one-bedroom rentals and sales in JVC."
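Once you have portal figures of your own, the yield itself is arithmetic you can check without the model. A minimal sketch, assuming "gross yield" means annual rent divided by purchase price; the function name is hypothetical and the sample inputs are placeholders, not market data.

```python
def gross_yield_pct(annual_rent: float, purchase_price: float) -> float:
    """Gross rental yield as a percentage: annual rent / purchase price * 100."""
    if purchase_price <= 0:
        raise ValueError("purchase_price must be positive")
    return annual_rent / purchase_price * 100


# Placeholder inputs only. Replace them with figures you pulled from a portal yourself.
rent = 60_000    # hypothetical annual rent, AED
price = 800_000  # hypothetical purchase price, AED
print(f"Gross yield: {gross_yield_pct(rent, price):.1f}%")
```

If your own number lands far from the one the model quoted, that gap is the thing to ask about next.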

Step 03 · Catch sycophancy

Make the model argue against itself.

Sycophancy is the failure mode that costs investors the most money. It is also the easiest to miss, because an agreeable answer feels like a useful one. The fix is to remove the social cue. Tell the model to attack its own answer and surface the assumptions you did not realise it was making.

02

Catches sycophancy

The investment-committee prompt.

Reframes Claude as a body whose job is to find reasons not to do the deal, not a buyer's agent whose job is to help you do it.

The one-liner · paste at the end of any answer

Do not validate my assumptions. Tell me what a sceptical investment committee would say.

The exact prompt

For decisions you are emotionally close to

Argue against your own answer. Do four things.

1. Give me the three strongest reasons your answer could be wrong, in order of severity.
2. List the assumptions you made that I did not explicitly tell you to make. Mark which ones the answer depends on.
3. Identify any signals in my original question that may have biased your answer toward what I wanted to hear. Quote the words that gave you the cue.
4. Now rewrite the answer from the perspective of a sceptical investment committee whose job is to find reasons not to do this deal. Use the same data. Different goal.

Do not soften this. If the answer should have been a "no", say so plainly.

Worked example · the off-plan question

Ask: "Is this Dubai off-plan deal a good buy at AED 1.4 million with a 60/40 payment plan and 36-month handover?"

Bad answer: "At AED 1.4 million with a flexible 60/40 plan, this looks like a solid entry point. Off-plan in this segment has historically appreciated 12-18% by handover."

After prompt: "Three reasons this could be wrong: I do not know the developer's delivery track record, I have no comparable handovers from this exact tower, and my appreciation figure was a generic off-plan average. Hidden assumptions: I assumed handover happens on schedule, I assumed service charges are within the projected range, I assumed exit liquidity in 36 months looks like exit liquidity today. The committee version would flag your question framing. You used the word 'good buy', which signalled you wanted a green light, not a verdict."
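The numbers in the question can be laid out before any model weighs in. A minimal sketch, assuming a 60/40 plan means 60% paid during construction and 40% at handover (developers define the split differently, so check the actual sales agreement); the function names are hypothetical, and the 12-18% range is the unverified figure from the bad answer, used only to show what that claim would imply.

```python
def payment_split(price: float, before_handover_pct: float) -> tuple[float, float]:
    """Split a price into (paid before handover, due at handover)."""
    before = price * before_handover_pct / 100
    return before, price - before


def implied_value(price: float, appreciation_pct: float) -> float:
    """Value at handover IF the claimed appreciation actually happened."""
    return price * (1 + appreciation_pct / 100)


price = 1_400_000  # AED, from the question
before, at_handover = payment_split(price, 60)  # assumed reading of "60/40"
print(f"Before handover: AED {before:,.0f}, at handover: AED {at_handover:,.0f}")

# The 12-18% range is the model's unsourced claim, not a verified figure.
for pct in (12, 18):
    print(f"{pct}% claim implies AED {implied_value(price, pct):,.0f} at handover")
```

Seeing the implied handover value in dirhams makes it easier to ask the model where the appreciation figure came from.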

Step 04 · Catch bias

Force a different lens.

Bias is the quietest of the three. The model is not making things up, and it is not agreeing with you. It is leaning on whatever its training data over-represents, which in property usually means the loudest few neighbourhoods, the most-published strategies, and the most-blogged-about exits. The fix is to demand a perspective the default would not have produced.

03

Catches bias

The bear-case prompt.

Names the over-represented defaults in the original answer, then forces the model to argue from a perspective it would not have reached on its own.

The one-liner · paste at the end of any answer

Take the opposite position and build the strongest bear case possible.

The exact prompt

For questions where the default answer feels obvious

Reanswer this question through three different lenses, in order.

1. A contrarian property investor with 20 years in the market who has actively avoided the most popular area or strategy in your first answer. Name the area or strategy they would have chosen instead, and why.
2. An investor focused purely on net cashflow yield, not capital appreciation. What would they prioritise that your first answer underweighted?
3. An investor with a five-year exit horizon who needs liquidity at the end of it. What risks does that surface that a hold-forever frame would have ignored?

For each lens, give me the strongest version of what that investor would say. Do not water it down to sound balanced. Then tell me which lens my original question was closest to, and what I might be missing by defaulting to it.

Worked example · the where-to-invest question

Ask: "Where should I invest in Dubai for capital growth over the next five years?"

Bad answer: "Dubai Marina, Downtown, and JVC remain the strongest capital-growth picks, supported by infrastructure investment and consistent demand."

After prompt: "Contrarian lens: a 20-year investor would point at Al Furjan and Dubai South for the next leg, citing 2026 metro expansion and the airport relocation timeline. Cashflow lens: Marina is over-supplied at the one-bedroom level, with falling net yields after service charges. JVC outperforms Marina on net yield by roughly 1.5 points. Five-year exit lens: Downtown carries the highest service-charge load in the city and the slowest secondary-market sell-through, which is the wrong shape for a five-year horizon. Your original question was closest to the capital-growth-everywhere lens, which is exactly what the loudest training data over-represents."

The honest bit

What this will not do.

Four limits worth knowing before you treat the pressure test as a guarantee.

i.

It does not verify the sources

The confidence check tags claims by uncertainty, not by truth. A claim tagged "high confidence" can still be wrong if the model's training data was wrong. You still have to check the highest-stakes numbers against Property Finder, Bayut, or the actual developer.

ii.

It does not replace your judgement

The investment-committee prompt surfaces reasons not to do the deal. It does not weigh them. You still decide whether the risks the model lists are deal-breakers or things you are happy to live with.

iii.

It will sometimes over-correct

Push hard enough on a confident answer and the model will start hedging on claims that were actually fine. The pressure test is a filter, not an oracle. Use it when the decision is irreversible. Skip it when you are brainstorming.

iv.

It is not a substitute for a deal call

None of this replaces walking the building, calling the broker, or having a contractor on speed-dial. The pressure test catches AI-shaped lies. It does not catch ground-shaped lies.

The workflow

How this stacks.

Day 01 installed the underwriter. Day 02 built the strategist. Day 03 made you check the marketing claims. Day 04 wired the five MCPs. Day 05 installed the Small Business plugin. Day 06 turned the same plugin into a property manager and a live dashboard. Day 07 is the safety check that runs on top of everything that came before it.

The workflow

The first six days made AI fast. Day 07 makes it trustworthy.

Speed without verification is the most expensive thing in this series. Run the pressure test on the underwriter's output before you submit an offer. Run it on the strategist's plan before you act on it. Run it on the dashboard's flagged anomalies before you escalate to your accountant. The cost is one prompt. The downside of skipping it is a confident answer you cannot defend.

What "done" looks like

By your tenth pressure test.

The first-ten-runs vision

You will recognise the lies before you run the prompts.

By your tenth pressure test you will start to spot the signatures without prompting. The unsourced yield, the suspiciously enthusiastic verdict, the three-area default. You will run the confidence check on the parts of the answer that smell wrong, and skip it on the parts that obviously do not.

The shift is not that you become someone who distrusts AI. It is that you become someone who knows what AI is good for, and what it needs to be checked on. The model becomes a sharper colleague. You become a slower buyer.

Three ways AI lies. Three prompts that stop it.
