There are 3.57 million children aged 0 to 4 in the UK. Around 250,000 families welcome a new baby in London every year. Every single one of them will, at some point, try to get a pushchair through a cafe door and discover whether it fits. And there is no data anywhere that tells them the answer in advance.

I am a dad in London with a two-year-old. I got tired of guessing. So I built an AI voice agent called Buggy Smart that phones venues and asks one question: are you pram-friendly? It has now made 2,093 calls across 12 London boroughs. Here are the ten things I learned.

1. Nobody has ever collected this data

Google tracks reviews. TripAdvisor tracks ratings. Yelp tracks opening hours. Not a single platform tells you whether your pushchair fits through the door. This is not a gap in the data. It is a category that does not exist. Every parent navigates this problem through trial and error, word of mouth, or just hoping for the best.

The numbers make it obvious. 3.57 million children under five in the UK. A quarter of a million new London families each year. Zero systematic data on which venues their buggy can actually get into. Every other consumer decision has been mapped, rated and optimised. This one is still guesswork.

That is what "genuinely greenfield" looks like. Not a slightly better version of something that already exists. Something that has never been attempted at all. The moment I realised nobody had done this, I knew the project was worth building.

2. 183 calls sounded like a broken fax machine

The first large batch of calls went out and 183 London cafes received a phone call from what sounded like garbled noise. Not a voice. Not a question. Just electronic screeching. The kind of sound that makes you hang up immediately and wonder who is prank-calling you from 1997.

The cause was painfully simple. The two systems speak different audio formats by default: Twilio's voice streams use 8 kHz mu-law (PCMU), while ElevenLabs was producing audio in a different encoding. The two systems were talking to each other in incompatible formats, and the result was nonsense. The fix was a single API parameter to change the encoding. It took about three minutes once I found it.
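For the curious, the shape of the fix is roughly this. `ulaw_8000` is the output format string ElevenLabs uses for Twilio-compatible audio; the config keys and the default format shown are illustrative, so check your SDK version rather than copying this verbatim.

```python
# Twilio Media Streams expect 8 kHz mu-law (PCMU) audio. ElevenLabs
# can emit that encoding directly via its output_format parameter,
# so no transcoding step is needed. Keys here are illustrative.

TTS_CONFIG = {
    "model_id": "eleven_turbo_v2",   # assumption: any low-latency model
    "output_format": "ulaw_8000",    # 8 kHz mu-law, matches Twilio
}

DEFAULT_FORMAT = "mp3_44100_128"     # what you get if you forget to set it

def is_twilio_compatible(fmt: str) -> bool:
    """Twilio call streams only accept 8 kHz mu-law audio."""
    return fmt == "ulaw_8000"
```

Forgetting that one parameter is the difference between a pleasant voice and a broken fax machine.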

The lesson took longer. 183 venues got a terrible first impression. Some of them will never pick up a call from that number again. The fix was obvious in hindsight, which is the most expensive kind of obvious. Test one call. Listen to it. Then run a thousand.

3. "Are you pram-friendly?" is a question venues have never been asked

This is the best part of the entire project. Because nobody has ever asked this question systematically, nobody has a scripted answer. What you get instead is genuine, unfiltered honesty. Staff respond in the moment, and their answers reveal far more than a simple yes or no.

Some venues are enthusiastic. Jumi Cheese in Stoke Newington told the AI: "You can definitely get the pram through the door. It's nearly one and a half metres wide." Gloria confirmed they could accommodate prams and had high chairs available. These are places that want families and are happy to say so.

Others get creative. Viktor Wynd Museum explained: "The pram can't go to the museum, but you can leave the pram upstairs and carry the baby downstairs." Hackney Picturehouse described their baby screenings: "We allow people to leave their prams in the foyer area. Cheaper tickets, and the lights don't go fully down." Perilla offered a practical tip: "If you leave a note in the reservation, especially if it's through the week, we will pretty much always have space." None of these responses fit neatly into a yes or no box. All of them are enormously useful.

4. Only 8 out of 2,093 calls detected it was an AI

That is 0.4%. Out of more than two thousand phone calls, eight people recognised they were speaking to a machine. The format is the reason. One question, twenty seconds, done. The call is short enough that most people answer before they have time to wonder who is asking.

There is also something about the question itself that helps. It is so specific, so clearly motivated by a real situation, that it reads as genuine. Nobody expects a robot to phone up and ask about pushchairs. The question carries its own credibility.

The low detection rate matters for data quality. If venues knew they were speaking to an AI, they might give more cautious or corporate answers. The honesty of the responses depends on them feeling like a normal phone conversation, and for 99.6% of calls, that is exactly what it felt like.

5. The amber category is where the real information lives

Green means yes, come on in. Red means no, you cannot bring a pushchair. Those are straightforward. But the most valuable information sits in the space between, the amber category I call "tight squeeze."

The 606 Club described their situation perfectly: "We're a basement venue, so you've got about fifteen steps to get down. Once you're in, it's all on the flat." That is not a yes and it is not a no. It is a maybe, with conditions. For a parent with a lightweight umbrella stroller, that might be fine. For someone with a double buggy, it is a definite no. The same venue can be green for one family and red for another.

This nuance is why a simple database of yes and no would miss the point. The amber venues, the ones with stairs and narrow doorways and creative workarounds, contain the richest data. They are the venues where a sentence of context makes the difference between a great afternoon and a wasted trip across London.
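The real pipeline hands the full transcript to Claude, but a toy version of the green/amber/red logic shows why a binary yes/no label loses exactly the information that matters. The keyword lists below are invented for illustration, not taken from the production prompt.

```python
# Toy green/amber/red classifier. The real system classifies full
# transcripts with Claude; this keyword sketch only illustrates why
# a yes/no database would flatten the "tight squeeze" middle ground.

RED_SIGNALS = ["adult only", "over eighteens", "no prams", "not accessible"]
AMBER_SIGNALS = ["steps", "stairs", "basement", "narrow", "leave the pram"]
GREEN_SIGNALS = ["definitely", "no problem", "high chairs", "plenty of space"]

def classify(transcript: str) -> str:
    """Return 'green', 'amber', or 'red' for a call transcript."""
    text = transcript.lower()
    if any(s in text for s in RED_SIGNALS):
        return "red"
    if any(s in text for s in AMBER_SIGNALS):
        return "amber"   # conditional access: the richest category
    if any(s in text for s in GREEN_SIGNALS):
        return "green"
    return "amber"       # ambiguous answers need review, not a guess
```

Note that the fallback is amber, not green: when a venue's answer is conditional or unclear, the honest label is "it depends".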

6. Pubs that say "over 18s only" are doing you a favour

The Kenton Pub was direct: "No, we are adult only pub, I'm afraid." Barrio Shoreditch said plainly: "We are an over eighteens only venue." Satan's Whiskers, same story. No ambiguity. No arriving with a pushchair and a toddler only to discover you are not welcome.

These honest refusals are some of the most useful data in the entire set. Every parent has a story about getting turned away at the door, or worse, making it inside only to realise the entire venue is clearly not set up for children. A clear no, given on the phone before you leave the house, saves you the trip, the awkwardness, and the wasted afternoon.

The venues that say no immediately are showing a form of respect. They are being honest about who they serve. That honesty is worth far more than a vague "we can try to make it work" that falls apart when you actually arrive.

7. 3x daily autonomous calling means the map grows while I sleep

The pipeline runs three times a day: 10:30am, 2pm, and 5pm. Each run makes up to 300 calls. After the calls finish, the system fetches the transcripts, classifies each response with Claude, matches the results to venues in the database, deploys the updated map, and sends a notification. No human in the loop at any stage.
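In outline, each run is a straight line from calls to deployment. The function names below are hypothetical stand-ins for the real stages; each one consumes the previous stage's output.

```python
# One autonomous run, sketched as a linear pipeline. Function names
# are illustrative stand-ins for the real call, classify, deploy and
# notify stages described above.

RUN_TIMES = ["10:30", "14:00", "17:00"]   # three runs a day
MAX_CALLS_PER_RUN = 300

def run_pipeline(venues, call_fn, classify_fn, deploy_fn, notify_fn):
    """Call up to MAX_CALLS_PER_RUN venues, classify, deploy, notify."""
    batch = venues[:MAX_CALLS_PER_RUN]
    transcripts = [call_fn(v) for v in batch]      # place the calls
    results = {v["id"]: classify_fn(t)             # green/amber/red
               for v, t in zip(batch, transcripts)}
    deploy_fn(results)                             # update the live map
    notify_fn(f"Run complete: {len(results)} venues classified")
    return results
```

Because every stage takes the previous stage's output and nothing else, there is no point where a human needs to step in.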

The results compound quickly. Two days before writing this, the map had 64 venues. Now it has 231, broken down as 183 green, 21 amber and 27 red. That is 167 new data points in 48 hours, all collected, classified, verified and published without me touching a keyboard.

This is the part that still surprises me. The map is growing while I am at the park with my kid. It is growing while I sleep. Autonomous calling at scale turns a weekend project into a living dataset that gets more useful every day.

8. Venue data quality follows a power law

Hackney has 248 venues in the database. Clapham has 2. The data is wildly uneven, concentrated in the boroughs where open data sources happen to be richest and call pickup rates happen to be highest. Most of the useful information comes from a handful of areas.

The naive approach would be to keep calling everywhere equally, which would pile up data in Hackney and leave Clapham empty. The smarter approach, and the one the pipeline now uses, is priority retry ranking. Boroughs with the fewest data points get called first. Underserved areas move to the top of the queue. The goal is coverage, not depth.
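A minimal version of that ranking is just a sort on per-borough counts. The Hackney and Clapham figures come from the article; the third count is invented for illustration.

```python
# Priority retry ranking: areas with the fewest confirmed data points
# go to the front of the calling queue, so coverage spreads out
# instead of piling up where pickup rates are already high.

def priority_order(counts: dict) -> list:
    """Return areas sorted by ascending data-point count."""
    return sorted(counts, key=lambda area: counts[area])

area_counts = {"Hackney": 248, "Islington": 90, "Clapham": 2}
```

The queue then draws venues from the top of that list, so Clapham gets called before Hackney gets its 249th data point.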

This is a general principle for any data collection project at scale. The first 80% of value comes from 20% of the effort. The last 20% of coverage, the Claphams and the Bexleys, requires disproportionate attention. You have to deliberately route resources away from where data is easy and towards where it is scarce.

9. The civic data angle is bigger than the parenting angle

I started building Buggy Smart because I am a parent who wanted to know which cafes my pushchair could get into. But the more data I collected, the more obvious it became that this is not just a parenting convenience tool. It is civic infrastructure that should exist but does not.

Local councils in the UK have legal obligations around accessibility. The Equality Act covers wheelchair access. Planning requirements cover step-free entry. But nobody collects systematic, venue-level data on physical accessibility for pushchairs, wheelchairs, or mobility aids. The data simply does not exist at borough level.

Borough accessibility reports could be worth somewhere between 500 and 2,000 pounds per borough. The parenting audience gets you users and word of mouth. The civic angle gets you institutions, council contracts, and a reason for the data to be maintained long after the initial novelty fades. The consumer product is the front door. The civic data layer is the business.

10. Test one call before you run a thousand

This is the lesson that runs through the entire project, and it deserves its own section because I learned it more than once.

The 183 garbled calls from the audio format mismatch. The duplicate calls that went out because deduplication was not in place. The prompt that drifted from saying "quick question" to "silly question" and was not caught until hundreds of calls had already gone out. Every single one of these mistakes was amplified by scale. And every single fix was obvious in hindsight.
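The deduplication fix, for what it is worth, is a few lines once it exists: normalise phone numbers before queueing so the same venue listed twice (once with spaces, once with a +44 prefix) is only rung once per batch. The normalisation rules here are a simplified sketch, not a full phone-number parser.

```python
# Deduplicate a calling queue by normalised phone number. The
# normalisation below is a deliberate simplification: strip spaces
# and dashes, map a leading 0 to the +44 UK prefix.

def normalise(number: str) -> str:
    """Canonicalise a UK phone number for comparison."""
    digits = number.replace(" ", "").replace("-", "")
    if digits.startswith("0"):
        digits = "+44" + digits[1:]
    return digits

def dedupe(queue: list) -> list:
    """Keep the first entry for each distinct normalised number."""
    seen, unique = set(), []
    for venue in queue:
        key = normalise(venue["phone"])
        if key not in seen:
            seen.add(key)
            unique.append(venue)
    return unique
```

Cheap to write, and it would have saved a batch of venues from being phoned twice in an afternoon.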

The total cost for Phase 1 was roughly 65 pounds. That covers Twilio call minutes, ElevenLabs voice generation, and Claude API calls for transcript classification across all 2,093 calls. It is remarkably cheap to make two thousand phone calls with an AI. Which means it is also remarkably cheap to make two thousand mistakes. The discipline is not in the building. It is in the testing. Run one call. Listen to the whole thing. Check the transcript. Verify the classification. Then, and only then, run a thousand.

The numbers

2,093 total calls. 1,528 unique venues across 12 London boroughs. 231 venues now on the live map: 183 green, 21 amber, 27 red. 8 AI detections out of 2,093 calls, a 0.4% detection rate. Around 12% of calls returned a useful answer. 183 calls lost to the audio encoding bug. Three calling runs per day. Total cost for Phase 1: approximately 65 pounds.

The map is live at buggysmart.app. Made by a London dad who got tired of guessing. Updated daily.