SAVE FILE LOADED  ·  LEVEL 7  ·  CLASS: DATA MAGE

Call me Princess Cruises, the way I ship

I'm a full-stack engineer who speedran the Dunning-Kruger effect. Not saying I'm Linus or anything, but I know that I don't know a lot, and that should count for something, I think. I'm a quick learner, which is great because I love learning new things. I ship often and get fired up building things that positively affect the world.


ROOM 02/ origin

You sly dog, you got me monologuing!

I was born in the third-world backwater city of San Diego. I had a chip on my shoulder because of it. I have always had the itch for entrepreneurship and loved the idea of building things of my own. As a young warthog I would proudly sell store-bought lemonade (at a markup) and run my own garage sales where I'd sell old Pokémon and Yu-Gi-Oh! cards (that are probably worth more now than they were then). After I graduated clown school (read: "Bachelor of Arts, Economics") I worked as a finance bro because I felt like it fit the gym-loving, fratty aesthetic I had GRINDED to build up. Unfortunately, being paid in Zyn coupons and plastic handles of Citron for my first analyst job truly didn't do it for me. BUT, alas, dear reader, it did teach me something. There might be something to this whole data analysis thing, and even more than that, learning some coding would probably be good for automating some of the bs I had to do in Excel.

From here I went on to dive headfirst into the deep end. After all, risk management is for actuaries and people who never dare to risk it all at the mere idea that they could change their lives forever. If you felt like that sounded like something out of r/Im14andthisisdeep then ow, but also fair point. ANYWAY, I went on to deep dive into learning SQL, which gave me my lucky break working for DSHS in Texas during COVID, a time when they were sorely lacking data analysts and NEEDED them. I got to do things I never thought I would while I was there, like present my research to the Governor of Texas.

I then dove deeper from there, learning Python, TS, JS, and Solidity, and worked in the crypto industry for several years as a full-stack engineer, primarily writing and testing smart contracts. I also had the opportunity to lead teams there. If I decide to start a blog, I'll share more there. I know I said that this was my monologue time but I think I'm getting early onset fibromyalgia (I just Googled that word and discovered I spelled it correctly on the first try!). I'll finish this section by mentioning that in the last few years I've found a good niche for myself, working at the intersection of AI and big data, still full-stack but focused on building data pipelines and shaving milliseconds off data queries.

You made it this far, dear reader. As a thanks, or maybe a you're welcome, here is a final promise from me to you. I know that sometimes the world is challenging, and that things seem uncertain at times. Whether or not our paths cross, I want you to know that if there is anything that I know for certain (read the beginning part about my relationship with Dunning-Kruger) it's that I am here to leave the world better than I found it. Whatever it takes, I'm going to make it happen. For all of us.

ROOM 03/ work

Ship I've shipped

Architecture, decisions, tradeoffs. Idk, nothing quippy to say about this section. Hopefully if a recruiter reads this they'll let me skip the multi-hour technical interviews that are, honestly, designed to test how well you can memorize facts and handle pressure, not do the job you're paid to do.

[01]

ScriptBuddyAI

scriptbuddyai.com

Voice memos in → platform-shaped scripts out. Streaming, multi-format, opinionated.

PROBLEM

Anyone who talks for a living has hours of voice notes and zero time to reshape them. Off-the-shelf Whisper gives you transcripts. Nobody publishes transcripts.

DECISIONS

  • OpenAI Whisper over self-hosted: faster time to market, predictable cost at this volume.
  • Vercel Edge for the streaming endpoint — SSE all the way to the browser.
  • Prompt-template chain instead of fine-tuning: cheaper to iterate, easier to audit.
  • Supabase RLS so user audio is isolated without writing an auth service.
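To make the prompt-template decision concrete, here is a minimal sketch of what a chain like this could look like. It is not the production code: the template wording, the `TEMPLATES` keys, and the function names are all hypothetical, and the actual model call is left out so the example stays self-contained.

```python
from string import Template

# Hypothetical templates, one per output format named in the architecture.
TEMPLATES = {
    "x_thread": Template(
        "Rewrite this transcript as a thread for X, one idea per post:\n$transcript"
    ),
    "linkedin": Template(
        "Rewrite this transcript as a LinkedIn post with a strong opening line:\n$transcript"
    ),
    "youtube_script": Template(
        "Rewrite this transcript as a YouTube script with a hook and an outro:\n$transcript"
    ),
}


def clean_transcript(raw: str) -> str:
    """Step 1 of the chain: collapse whitespace left over from transcription."""
    return " ".join(raw.split())


def build_prompts(raw_transcript: str) -> dict[str, str]:
    """Step 2: fan one cleaned transcript out into one prompt per format."""
    transcript = clean_transcript(raw_transcript)
    return {
        name: template.substitute(transcript=transcript)
        for name, template in TEMPLATES.items()
    }
```

Because every prompt is plain text built from a named template, changing a format means editing one string, and each generated prompt can be logged and reviewed as-is. That is the "cheaper to iterate, easier to audit" half of the tradeoff.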

ARCHITECTURE

browser
  │ upload .m4a (chunked, resumable)
  ▼
Vercel Edge fn ── signed URL ──> S3
  │
  ▼
Whisper API ── transcript ──> prompt chain
                              │
                  ┌───────────┼───────────┐
                  ▼           ▼           ▼
              X thread    LinkedIn    YouTube
              (SSE)       (SSE)       script
                  │
              Supabase  ─ RLS per user ─ history

OUTCOME

  • Public product live with auth, pricing, and script generation flows.
  • Typical 5-minute memo to multi-format drafts: under ~30 seconds in normal runs.
  • Generation cost kept to low cents per script at current usage patterns.

STACK

Next.js · TypeScript · OpenAI · Tailwind · Vercel Edge · Supabase · S3

[02]

AI Financial Analyst

aifa.money

Retail investors get the why, not just the heatmap. Live tickers, on-demand reasoning.

PROBLEM

Consumer finance apps tell you what moved. They rarely tell you why, and the explanations they do give are either too shallow or paywalled behind a Bloomberg terminal.

DECISIONS

  • FastAPI for the ingest tier — async websockets, typed contracts, fast iteration.
  • TF-Serving for inference: models are pre-trained, inference-only; no PyTorch needed.
  • Supabase for auth + a Postgres feature store; saved me a quarter of plumbing.
  • Server-side rate-limiting on the LLM explainer — cost is the real adversary.
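One common way to do server-side rate limiting like this is a token bucket per user. The sketch below is an illustration under assumptions, not the app's actual limiter: the class name, the limits, and the injectable `clock` parameter (there to make it testable) are all made up for the example.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow bursts of up to `capacity` calls, refilled at `refill_per_sec`."""

    def __init__(
        self,
        capacity: float,
        refill_per_sec: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock
        self.tokens = capacity  # start with a full bucket
        self.updated_at = clock()

    def allow(self) -> bool:
        """Spend one token and return True if the caller is under the limit."""
        now = self.clock()
        elapsed = now - self.updated_at
        # Refill for the time that has passed, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated_at = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Hypothetical per-user registry: 3 explanations in a burst, then 1 every 10 s.
_buckets: dict[str, TokenBucket] = {}


def allow_explanation(user_id: str) -> bool:
    if user_id not in _buckets:
        _buckets[user_id] = TokenBucket(capacity=3, refill_per_sec=0.1)
    return _buckets[user_id].allow()
```

The check runs before any LLM call is made, so a request that is turned away costs nothing.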

ARCHITECTURE

Polygon WS  ─┐
             │
Tiingo REST ─┼─> FastAPI ingest ─> Postgres
                                   (feature store, RLS)
                                          │
        TF-Serving ─<─────────────────────┘
                │
                ▼
        LLM explainer ─ retrieves features ─> "why it moved"
                │
                ▼
        React ── WS push ── live tickers + cards

OUTCOME

  • 1k+ liquid US tickers supported in the research/watchlist universe during testing.
  • Sub-second model inference for cached feature lookups; LLM explanations return in seconds.
  • Dev/demo infra kept under ~$100/month before paid market-data costs.

STACK

React · Python · FastAPI · TensorFlow · Supabase · Postgres · Polygon

[03]

Data Acquisition Pipeline

Went full manifest destiny on public and synthetic data sources.

PROBLEM

Brilliant AI researchers built a cool model. It needed tens of millions of data points to train on.

ROLE

  • Built end-to-end ETL pipelines to tactically acquire public data, generate synthetic data, and transform real and synthetic data for model use.
  • Used Modal workers and S3 handoffs so acquisition jobs could scale without becoming one giant long-running service.
  • Wired CloudWatch visibility around runs because silent data failures are how models get haunted.
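One way to keep acquisition jobs like these resumable is to record each completed item as it finishes. The sketch below is a simplified illustration, not the pipeline's real code: it checkpoints to a local JSON file where the real system hands off through S3, and `run_with_checkpoint` and its parameters are hypothetical names.

```python
import json
from pathlib import Path
from typing import Callable, Iterable


def run_with_checkpoint(
    items: Iterable[str],
    handle: Callable[[str], None],
    checkpoint_path: Path,
) -> int:
    """Process items in order, skipping any already recorded in the checkpoint.

    Returns the number of items processed during this run.
    """
    done: set[str] = set()
    if checkpoint_path.exists():
        done = set(json.loads(checkpoint_path.read_text()))

    processed = 0
    for item in items:
        if item in done:
            continue  # finished in an earlier run
        handle(item)  # may raise; the checkpoint only ever lists completed items
        done.add(item)
        checkpoint_path.write_text(json.dumps(sorted(done)))
        processed += 1
    return processed
```

If a worker dies partway through, rerunning it with the same checkpoint path picks up at the first unfinished item instead of starting over.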

ARCHITECTURE

              Scheduler
                  │
                  ▼
        Modal worker fleet ── rotating proxies ──> public sources
                  │
   synthetic gen ─┤
                  ▼
        Raw S3 partitions ── normalize + dedupe ──┐
                                                  │
                                                  ▼
                                       Training-ready S3
                                                  │
                                                  ▼
                                       CloudWatch ── logs + alerts

OUTCOME

  • 1M+ rows/day acquired and transformed.
  • Checkpointed workers made failed runs safe to resume.
  • Raw and normalized datasets were separated for auditability.

STACK

Python · Modal · S3 · CloudWatch · Terraform · Rotating proxies

ROOM 04/ craft

Stacks on Stacks on Stacks

TOOLS. Sometimes maybe good sometimes maybe shit.

INGEST

Pull it, push it, or drop it in a bucket — the data has to get in somehow.

REST APIs · Webhooks · Batch jobs · S3/GCS/Blob

ORCHESTRATE

Anything that shouldn't block a user request lives here. Cron for the trivial; Trigger.dev or Inngest when retries and timeouts deserve real care; Modal when I'd rather not babysit a server.
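For a sense of what "retries deserve real care" means in practice, here is a minimal stdlib sketch of retry with exponential backoff. It is a stand-in for the idea only: Trigger.dev, Inngest, and Modal each have their own retry configuration, and the `retry` helper and its parameters here are hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(
    task: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run task, retrying on any exception with exponential backoff."""
    for attempt in range(attempts):
        try:
            return task()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: surface the last error
            sleep(base_delay * 2**attempt)  # 0.5 s, 1 s, 2 s, ...
    raise ValueError("attempts must be at least 1")
```

Passing `sleep` in as a parameter keeps the helper testable without actually waiting.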

Cron · GitHub Actions · Cloud schedulers · Trigger.dev · Inngest · Modal

TRANSFORM

Pandas and NumPy when the transformation lives next to the product or the analysis. SQL when it belongs in the database. Pydantic turns messy external payloads into typed contracts; BeautifulSoup and regex handle the extraction half.
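As a rough illustration of the "messy payload in, typed contract out" idea, the sketch below uses a stdlib dataclass as a dependency-free stand-in for a Pydantic model, plus a regex for the extraction half. The `Quote` shape and the `ticker` and `price` fields are hypothetical examples, not a real schema from any project here.

```python
import re
from dataclasses import dataclass

# Matches prices such as "$1,234.50" or "1234.5" inside free text.
PRICE_RE = re.compile(r"\$?\s*(\d[\d,]*(?:\.\d+)?)")


@dataclass(frozen=True)
class Quote:
    """The typed contract downstream code can rely on."""

    ticker: str
    price: float


def parse_quote(payload: dict) -> Quote:
    """Validate and normalize one messy payload, or raise ValueError."""
    ticker = str(payload.get("ticker", "")).strip().upper()
    if not ticker:
        raise ValueError("missing ticker")
    match = PRICE_RE.search(str(payload.get("price", "")))
    if match is None:
        raise ValueError(f"no price found for {ticker}")
    return Quote(ticker=ticker, price=float(match.group(1).replace(",", "")))
```

Everything past `parse_quote` only ever sees a `Quote`, so bad input fails loudly at the boundary instead of somewhere deep in the pipeline.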

Pandas · NumPy · SQL · SQLAlchemy · Pydantic · Jupyter · BeautifulSoup · regex

STORE

Postgres is the source of truth until it isn't. Redis for things that should be fast and forgettable, object storage for things you might want a year from now, OpenSearch when search is the actual job.
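"Fast and forgettable" usually means a cache entry with an expiry sitting in front of the slower source of truth. Here is a small in-memory sketch of that pattern. It stands in for Redis rather than talking to it, and the `TTLCache` class and its `clock` parameter are hypothetical names for illustration.

```python
import time
from typing import Any, Callable


class TTLCache:
    """In-memory stand-in for a cache key that expires after a while."""

    def __init__(
        self,
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries: dict[str, tuple[float, Any]] = {}

    def get_or_load(self, key: str, load: Callable[[], Any]) -> Any:
        """Return the cached value, or call load() and cache its result."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh
        value = load()  # e.g. the slow query against the source of truth
        self._entries[key] = (now + self.ttl_seconds, value)
        return value
```

Losing the cache costs only a slower next read, which is what makes it safe to treat as forgettable.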

Postgres · Prisma · Supabase · Redis · S3/GCS · OpenSearch · Snowflake · BigQuery · Redshift

SERVE

FastAPI when the shape of the data matters. Next.js when the shape of the page matters. REST as the default, GraphQL when the client genuinely needs to ask for half the world, and a queue any time the answer can show up later.
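The "queue any time the answer can show up later" idea can be sketched with nothing but the standard library: hand back a job id right away and let a background worker fill in the result. This is a toy, in-process illustration with hypothetical names (`submit`, `worker`, `results`), not a substitute for a real queueing service.

```python
import queue
import threading
import uuid
from typing import Callable

jobs: queue.Queue = queue.Queue()
results: dict[str, object] = {}


def submit(task: Callable[[], object]) -> str:
    """Enqueue a task and return a job id right away."""
    job_id = uuid.uuid4().hex
    jobs.put((job_id, task))
    return job_id


def worker() -> None:
    """Drain the queue forever, storing each result under its job id."""
    while True:
        job_id, task = jobs.get()
        try:
            results[job_id] = task()
        except Exception as exc:
            results[job_id] = exc  # keep the failure so the caller can see it
        finally:
            jobs.task_done()


# A daemon thread stops when the main program exits.
threading.Thread(target=worker, daemon=True).start()
```

The request handler only calls `submit` and returns the id; the client checks `results` later, or in a real system gets a webhook or push when the job is done.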

FastAPI · Next.js · REST · GraphQL · Serverless · Queues

OBSERVE

Logs you can grep, metrics you can graph, alerts you can mostly ignore. OTel, Grafana, and Datadog when the stack yearns for them.
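"Logs you can grep" mostly comes down to one structured record per line. Below is a minimal sketch using only the standard `logging` and `json` modules. The `JsonFormatter` class and the `context` key passed through `extra` are conventions invented for this example, not part of any particular stack.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Fields passed via extra={"context": {...}} are merged in as-is.
        payload.update(getattr(record, "context", {}))
        return json.dumps(payload, sort_keys=True)


def get_logger(name: str) -> logging.Logger:
    """Return a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid stacking handlers on repeat calls
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
    return logger
```

A call such as `get_logger("pipeline").info("run finished", extra={"context": {"rows": 1200}})` then produces a single line that can be filtered with grep or parsed by a log shipper.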

CloudWatch · Structured logs · Metrics · Alerts · OpenTelemetry · Grafana · Datadog

ROOM 05/ now

Interests in and outside of work

QUEST AVAILABLE

Open to project management and software/data engineering roles.

Currently: Lead Software Engineer (Full-Stack). Remote, US.

  • Bridging the gap between technical & non-technical teams.
  • Building shit that makes people's lives BETTER. The more people it helps, the better.
  • Creating and executing on big ideas, bringing in bright folks around me and working together to get it done.
  • Work culture that puts the team first. This does mean what you think it means.
  • Interesting, challenging problems that need to be solved.

ROOM 06/ contact

Send a raven.

~/prefontaine.dev — contact.sh
$ whoami
phoenix · lead data engineer
$ cat contact.json
{
  "email": "hirepre@proton.me",
  "github": "github.com/phoenixpre",
  "linkedin": "linkedin.com/in/phoenix-prefontaine"
}
$ grep best_for inbox.log
# "interesting data, hard tradeoffs, and a team that ships."
# I read everything. I reply to most things within 48 hours.
$ _