- Sedky Haider
As a frequent traveler, I kept running into trouble staying organized across all my flights, hotels, Airbnbs, car reservations, etc.
My partner and I figured that if we were running into these issues, others were as well. So we set out to build a product that automatically tracks reservations and their changes with little manual effort on our part.
Thus, Agendo was born. Well, not yet launched as of today :p
What's Agendo? It's a travel wallet that's integrated with your email. It looks for reservation emails and adds them to your wallet. It also notifies you of flight delays and check-in confirmations, lets your friends and family stay connected with your travels, and more.
As for the stack, I've been using nearly the same one for years. It's a balance of speed, DevEx, and price. Read here about building an MVP.
The syncing flow is quite straightforward. We use an OAuth exchange, prompting the user for the correct OAuth scopes. We save a refresh token and an access token so we can set up syncs with MS/Google (these expire weekly).
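As a rough illustration of the first step of that exchange on the Google side, here is a minimal sketch that builds the consent URL. The client ID, redirect URI, and the choice of the `gmail.readonly` scope are assumptions for the example, not Agendo's actual configuration. `access_type=offline` is what asks Google to return a refresh token alongside the access token.

```typescript
// Sketch only: clientId, redirectUri and the scope list are placeholders.
interface OAuthConfig {
  clientId: string;
  redirectUri: string;
  scopes: string[];
}

// Builds the URL the user is sent to in order to grant access.
function buildGoogleConsentUrl(config: OAuthConfig, state: string): string {
  const params = new URLSearchParams({
    client_id: config.clientId,
    redirect_uri: config.redirectUri,
    response_type: "code", // authorization-code flow
    scope: config.scopes.join(" "), // scopes are space-separated
    access_type: "offline", // request a refresh token
    prompt: "consent",
    state, // opaque value echoed back, used to guard against CSRF
  });
  return `https://accounts.google.com/o/oauth2/v2/auth?${params.toString()}`;
}

const consentUrl = buildGoogleConsentUrl(
  {
    clientId: "YOUR_CLIENT_ID",
    redirectUri: "https://example.com/oauth/callback",
    scopes: ["https://www.googleapis.com/auth/gmail.readonly"],
  },
  "random-csrf-token",
);
console.log(consentUrl);
```

The server then exchanges the `code` it receives on the redirect for the token pair. Requesting the narrowest read-only scope that works is one way to reduce the friction described next.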
This is the highest-friction part of the user experience. After all, Big Tech has taught us to be wary of giving our data to just anyone. We take data seriously, and never save or present information from your emails aside from the absolute minimum, which is shown to the user about their upcoming reservations.
New Email Flow
When a synced user receives a new email, Google/MS sends a webhook to our server.
The webhook handler is the core logic, where we perform filtering and normalizing.
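For the Google side, a minimal sketch of reading such a notification might look like the following. It assumes Gmail push notifications delivered through a Cloud Pub/Sub push subscription, where the message `data` field is a base64url-encoded JSON object carrying `emailAddress` and `historyId`. The function and interface names are made up for the example, and the Microsoft payload is different and not covered here.

```typescript
// Shape of the parts of a Pub/Sub push request body that this sketch uses.
interface PubSubPushBody {
  message: { data: string; messageId: string };
  subscription: string;
}

interface GmailNotification {
  emailAddress: string;
  historyId: string;
}

// Decodes base64url (or plain base64) into a UTF-8 string.
function decodeBase64Url(data: string): string {
  const base64 = data.replace(/-/g, "+").replace(/_/g, "/");
  const padded = base64.padEnd(Math.ceil(base64.length / 4) * 4, "=");
  const bytes = Uint8Array.from(atob(padded), (ch) => ch.charCodeAt(0));
  return new TextDecoder().decode(bytes);
}

// Extracts and validates the notification payload; throws on unexpected input.
function parseGmailNotification(body: PubSubPushBody): GmailNotification {
  const payload: unknown = JSON.parse(decodeBase64Url(body.message.data));
  if (typeof payload !== "object" || payload === null) {
    throw new Error("Notification payload is not an object");
  }
  const { emailAddress, historyId } = payload as Record<string, unknown>;
  if (typeof emailAddress !== "string") {
    throw new Error("Notification is missing emailAddress");
  }
  if (typeof historyId !== "string" && typeof historyId !== "number") {
    throw new Error("Notification is missing historyId");
  }
  return { emailAddress, historyId: String(historyId) };
}

// Example with a locally encoded payload standing in for a real request body.
const sampleBody: PubSubPushBody = {
  message: {
    data: btoa(JSON.stringify({ emailAddress: "user@example.com", historyId: 9876543210 })),
    messageId: "1",
  },
  subscription: "projects/example/subscriptions/gmail-push",
};
const notification = parseGmailNotification(sampleBody);
console.log(notification);
```

The notification only says that something changed in the mailbox. The handler still has to fetch the new messages themselves before any filtering or normalizing can happen.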
The "normalize" part of the transaction is the hardest. There are thousands of airlines all over the world, and they all use a completely different email structure.
As a result, a combination of techniques is the most effective way to distill all the plain text into something structured.
Google and Microsoft also deliver email content in many different over-the-wire formats: Base64, plain text, HTML, various MIME types, and more.
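One of those techniques can be sketched as a small normalization step that turns an already-decoded body into compact plain text, whatever its MIME type. This is a deliberately naive, regex-based illustration with hypothetical function names. A real pipeline would more likely use a proper HTML parser.

```typescript
// Naive HTML-to-text conversion: drops script/style blocks and tags,
// decodes two common entities, and collapses whitespace.
function htmlToText(html: string): string {
  return html
    .replace(/<(script|style)[^>]*>[\s\S]*?<\/\1>/gi, " ")
    .replace(/<[^>]+>/g, " ")
    .replace(/&nbsp;/g, " ")
    .replace(/&amp;/g, "&") // decoded last to avoid double-decoding
    .replace(/\s+/g, " ")
    .trim();
}

// Returns compact plain text for either an HTML or a plain-text body.
function normalizeBody(mimeType: string, content: string): string {
  return mimeType.toLowerCase().startsWith("text/html")
    ? htmlToText(content)
    : content.replace(/\s+/g, " ").trim();
}

const sampleHtml =
  "<html><style>p{color:red}</style><p>Flight&nbsp;AC123 &amp; seat 12A</p></html>";
console.log(normalizeBody("text/html", sampleHtml));
// prints "Flight AC123 & seat 12A"
```

Shrinking the text this way also cuts the number of tokens sent downstream.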
For this reason, we decided to leverage the power of LLMs to convert blocks of plain text into JSON.
Prompt-Driven Engineering to Schema-Driven Engineering
Of course, we first focused on prompt-driven engineering. We wrote a specific prompt, which also included the JSON type we wanted the response to come back in.
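To make the idea concrete, a prompt of this kind could be assembled roughly as follows. The wording of the instructions and the field names are invented for the example and are not the prompt we shipped. The type is kept flat, with plain strings and a boolean.

```typescript
// Hypothetical flat JSON type embedded verbatim in the prompt.
const FLIGHT_JSON_TYPE = `{
  "flightNumber": string,
  "departureCity": string,
  "arrivalCity": string,
  "departureDate": string,
  "isRoundTrip": boolean
}`;

// Combines the instructions, the expected type, and the email text.
function buildExtractionPrompt(emailText: string): string {
  return [
    "Extract the flight reservation from the email below.",
    "Respond with JSON only, matching this type exactly:",
    FLIGHT_JSON_TYPE,
    "Email:",
    emailText,
  ].join("\n\n");
}

const prompt = buildExtractionPrompt("Your flight AC123 from Toronto to Vancouver departs on 2024-03-01.");
console.log(prompt);
```

Every change to the type or to the instructions changes the whole prompt, which is why tuning one tends to disturb the other.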
Using ChatGPT 3.5 and 4, we had mixed results. It took LOTS of tuning of the prompt as well as the JSON type (we had to use bools and flat data structures).
However, the failure rate was quite high. We hovered at around an 87% success rate, meaning that for every 100 flights, 13 would have bad data, whether in the date, the flight time, etc.
For a lot of use cases, 87% is good enough. But flight departures are already a stressful period, and you need as little as possible to go wrong. Nothing less than 100% was acceptable.
Evolving the data structure was a giant pain: every minor schema modification, and every slightly unusual flight email layout, meant new adjustments that still had to satisfy the entire test suite.
Fortunately, I stumbled on an article on HN about TypeChat and tried it. It was surprisingly effective: we were able to ditch the entire prompt and use only the data model. However, the failure rate stayed roughly the same (it actually increased slightly).
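In a schema-driven setup, the TypeScript type itself is the contract, and the model's reply is checked against it. The sketch below shows what such a data model could look like, with a hand-rolled validator standing in for the library's validation step. It does not use TypeChat's API, and the field names are hypothetical.

```typescript
// Hypothetical data model: this type is the only "prompt" we maintain.
interface FlightReservation {
  flightNumber: string;
  departureCity: string;
  arrivalCity: string;
  departureDate: string; // ISO date, e.g. "2024-03-01"
  isRoundTrip: boolean;
}

type ValidationResult =
  | { success: true; data: FlightReservation }
  | { success: false; message: string };

// Stand-in validator: checks that a raw model reply matches the data model.
function validateFlightReservation(raw: string): ValidationResult {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return { success: false, message: "Response is not valid JSON" };
  }
  if (typeof parsed !== "object" || parsed === null) {
    return { success: false, message: "Response is not a JSON object" };
  }
  const obj = parsed as Record<string, unknown>;
  for (const key of ["flightNumber", "departureCity", "arrivalCity", "departureDate"]) {
    if (typeof obj[key] !== "string") {
      return { success: false, message: `"${key}" must be a string` };
    }
  }
  if (typeof obj["isRoundTrip"] !== "boolean") {
    return { success: false, message: '"isRoundTrip" must be a boolean' };
  }
  return { success: true, data: obj as unknown as FlightReservation };
}

const reply = '{"flightNumber":"AC123","departureCity":"Toronto","arrivalCity":"Vancouver","departureDate":"2024-03-01","isRoundTrip":false}';
console.log(validateFlightReservation(reply));
```

A failed validation yields a message that can be sent back to the model for another attempt. Note that this only guarantees the shape of the data, not that the values are right, which is consistent with the failure rate not improving.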
Breaking down the problem
To get closer to 100%, we needed the LLM to do less. We were asking too much of it. Maybe a future LLM could handle it, but as of today, piping in a few thousand tokens' worth of plain text, including a lot of garbage filler/promotional material from airlines, was a high burden, even after cleaning it up slightly.
So the answer, of course, was to ask it for less output. Flights are documented extremely accurately. There is a saturated market dedicated to providing flight details online, details we were needlessly burdening the LLM with.
The LLM does not need to provide the entire flight itinerary. If it can simply provide the departure and arrival cities and dates, plus the flight number, we can fetch everything else from flight APIs.
This approach minimizes the reliance we have on the LLM.
Here is the core business logic:
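A minimal sketch of that split, not the production code: the LLM supplies only a small key, and everything else comes from a flight data provider. The `FlightLookup` function type, the record fields, and the fake provider below are hypothetical stand-ins for whichever flight API is used.

```typescript
// The few fields we ask the LLM for.
interface ExtractedFlight {
  flightNumber: string;
  departureCity: string;
  arrivalCity: string;
  departureDate: string; // ISO date
}

// Hypothetical shape of what a flight data provider returns.
interface FlightApiRecord {
  scheduledDeparture: string;
  scheduledArrival: string;
  status: string;
}

// Injected lookup, so any provider (or a test fake) can be plugged in.
type FlightLookup = (flightNumber: string, date: string) => Promise<FlightApiRecord | null>;

interface Itinerary extends ExtractedFlight {
  details: FlightApiRecord | null; // null when the provider has no match
}

// "ac 123" -> "AC123", so lookups do not depend on the email's formatting.
function normalizeFlightNumber(raw: string): string {
  return raw.replace(/\s+/g, "").toUpperCase();
}

// Enriches the LLM's minimal output with authoritative provider data.
async function buildItinerary(extracted: ExtractedFlight, lookup: FlightLookup): Promise<Itinerary> {
  const flightNumber = normalizeFlightNumber(extracted.flightNumber);
  const details = await lookup(flightNumber, extracted.departureDate);
  return { ...extracted, flightNumber, details };
}

// Fake provider used only for this example.
const fakeLookup: FlightLookup = async (flightNumber, date) =>
  flightNumber === "AC123"
    ? { scheduledDeparture: `${date}T08:00`, scheduledArrival: `${date}T10:15`, status: "scheduled" }
    : null;

buildItinerary(
  { flightNumber: "ac 123", departureCity: "Toronto", arrivalCity: "Vancouver", departureDate: "2024-03-01" },
  fakeLookup,
).then((itinerary) => console.log(itinerary));
```

With this shape, a wrong departure time can no longer come from the LLM. The worst case is a lookup that finds no match, which is easy to detect and flag.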
And the rest is pretty straightforward stuff that's been covered extensively, by me or elsewhere:
- notifications / alerts / delays
- user/friend social system