We Tested a Generic AI Against HAQQ on Real Startup Legal Documents. Here's What Happened.

By Stephane Boghossian · 14 min read · Ai-legal-tech

A real experiment with a real client: Claude (generic LLM) vs HAQQ (domain-specific legal AI) drafting co-founder agreements and IP assignments. 5 rounds, 13 issues found, 32 pages produced. The results weren't even close.

The Setup

A founder came to us with a concrete need. They're co-founding an AI startup with a technical partner who built the entire codebase. The business founder handles strategy, growth, and partnerships. They needed two documents before they could move forward: an IP assignment and a co-founder agreement.

These aren't hypothetical. The founders are actively pursuing multiple paths: an acquisition listing, open-source launch, and a pre-seed raise. The documents need to be real.

We proposed an experiment: What if we had a generic LLM draft the documents first, then had HAQQ review and revise them? The founder agreed. Here's what happened across 5 rounds.

Round 1: Claude (Generic LLM)

The founder gave Claude Opus detailed context about the startup — the tech stack, the equity split, their open-source commitment, German jurisdiction, the three strategic paths — and asked it to draft both agreements.

What came back: A ~10-page document covering the basics. IP assignment, equity split, vesting schedule, open-source clause, German arbitration. It looked like a legal document. It used legal language. It had section numbers.

A solid first draft from a smart intern who's read a few term sheets.

For a generic AI with no legal training data, no jurisdiction-specific knowledge, and no understanding of startup mechanics, it was impressive. Two years ago, getting this output from any AI would have been headline news.

But the founder wasn't looking for impressive. They were looking for signable.

Round 2: HAQQ Reviews and Revises

We took Claude's draft and fed it into chat.haqq.ai.

HAQQ's first response wasn't a revised document. It was a 13-point critique.

What HAQQ Found Wrong with the Generic AI Draft

Every single one of these is the kind of thing a startup lawyer would catch in the first read. None of them are exotic. They're table stakes for a real founder agreement.

The Revised Version

HAQQ then produced a 27-page revision across two interlocking agreements with schedules.

Agreement 1: IP Assignment (14 sections + 2 schedules)

Agreement 2: Co-Founder Agreement (17 sections)

Round 3: Claude Reviews HAQQ's Work

We then brought HAQQ's 27-page revision back to Claude for a neutral review. Would a generic AI recognize the improvements? Or would it think its own draft was fine?

Claude's assessment was unambiguous:

HAQQ's revision is a major upgrade. It transforms what was a reasonable AI-generated template into something approaching sign-ready. Grade: B+ to A-.

Credit where it's due — Claude was honest. It identified 7 remaining refinements:

Good suggestions. But here's where it got interesting.

Round 4: HAQQ Reviews Claude's Review

We fed Claude's 7-point feedback back into HAQQ. Would HAQQ agree? Push back? Find things Claude missed?

HAQQ's response: "Claude's feedback is 85-90% aligned with what we would recommend."

HAQQ agreed with 6 of 7 points. On compensation, HAQQ partly disagreed — recommending a separate side letter rather than baking salary and sale economics into the co-founder agreement. Smart separation of concerns.

But then HAQQ went further. It flagged 5 Germany-specific issues Claude completely missed:

This is the moment the experiment got real. Claude's feedback was solid in the abstract. But it was reviewing as if this were a Delaware startup with common-law mechanics. HAQQ knew the startup was incorporating in Germany and applied the right legal framework.

Round 5: HAQQ Implements Everything

This is where HAQQ proved it's not just a critic. We fed back Claude's 7 suggestions plus HAQQ's own 5 Germany-specific findings, and asked HAQQ to produce Version 2.

What came back: 32 pages. Two agreements. Three schedules.

HAQQ didn't just apply the feedback. It went beyond it.

Things nobody asked for (but HAQQ added anyway)

The Evolution: 10 Pages to 32 Pages in 5 Rounds

From a basic template to a 32-page, 3-schedule, German-law-aware, open-source-friendly, multi-path-exit-ready founder document package. Built entirely by AI. Argued over by two AI systems with different strengths.

Why This Matters

This experiment revealed something important: the best results come from making AIs argue with each other.

Claude brought breadth — wide knowledge, honest self-assessment, good structural suggestions. HAQQ brought depth — jurisdiction-specific precision, document architecture, German-law compliance.

Neither alone produced the ideal document. Together, through 5 rounds of iterative review, they produced something a German startup lawyer can finalize in a few hours.
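For readers who want to try the same back-and-forth, here is a minimal Python sketch of the five-round sequence described above. It is an illustration, not how the experiment was actually run: the `Model` type, the `Round` record, the `run_review_rounds` function, and the prompt wording are all hypothetical, and each model is just a function that takes a prompt and returns text, so no real Claude or HAQQ API is assumed.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stand-in for any chat model: prompt text in, reply text out.
Model = Callable[[str], str]


@dataclass
class Round:
    number: int
    actor: str   # "generalist" or "specialist"
    action: str
    output: str


def run_review_rounds(generalist: Model, specialist: Model, brief: str) -> List[Round]:
    """Replay the draft / revise / review / counter-review / implement sequence."""
    log: List[Round] = []

    # Round 1: the generalist drafts from the founder's brief.
    draft = generalist(f"Draft the agreements from this brief:\n{brief}")
    log.append(Round(1, "generalist", "draft", draft))

    # Round 2: the specialist critiques the draft and produces a revision.
    revision = specialist(f"Critique and revise this draft:\n{draft}")
    log.append(Round(2, "specialist", "critique and revise", revision))

    # Round 3: the generalist reviews the specialist's revision.
    review = generalist(f"Review this revision and list remaining issues:\n{revision}")
    log.append(Round(3, "generalist", "review", review))

    # Round 4: the specialist assesses that review and adds what it missed.
    counter = specialist(f"Assess this review and add anything it missed:\n{review}")
    log.append(Round(4, "specialist", "review the review", counter))

    # Round 5: the specialist applies both sets of feedback to the revision.
    final = specialist(
        "Produce the next version applying both sets of feedback.\n"
        f"Current version:\n{revision}\n"
        f"Review:\n{review}\n"
        f"Counter-review:\n{counter}"
    )
    log.append(Round(5, "specialist", "implement", final))
    return log
```

Passing the two models in as plain functions keeps the loop independent of any vendor SDK, and the returned log records which system produced what in each round.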

The gap comes from domain knowledge:

The Bottom Line

After 5 rounds of a generic LLM and a legal AI arguing over real founder documents, our client has a 32-page package that's been stress-tested from both sides. Claude brought the breadth. HAQQ brought the depth. Each one caught things the other missed.

That's why we built HAQQ. Not to replace general-purpose AI, but to be the specialist that makes the generalist's output actually signable.

This experiment was conducted in April 2026 using Claude Opus 4.6 and chat.haqq.ai. The client's identity and startup details have been anonymized. No lawyers were harmed — but one will be hired to finalize the documents.

The Setup

A founder came to us with a specific need. They are founding an AI startup with a technical partner who built the entire codebase. The business founder handles strategy, growth, and partnerships. They needed two documents before moving forward:

We proposed an experiment: what if a generic AI drafted the documents first, and HAQQ then reviewed and revised them? The founder agreed. Here is what happened across 5 rounds.

Round 1: Claude (Generic AI)

The founder gave Claude Opus detailed context about the startup and asked it to draft both agreements. What came back: a ~10-page document covering the basics.

A solid first draft from a smart intern who has read a few term sheets.

But the founder wasn't looking for something impressive. They were looking for something signable.

Round 2: HAQQ Reviews and Revises

We took Claude's draft and fed it into chat.haqq.ai. HAQQ's first response wasn't a revised document. It was a 13-point critique.

Every one of these issues is the kind a startup lawyer would catch on the first read. None of them are exotic. They are the minimum for a real founder agreement.

HAQQ then produced a 27-page revision across two interlocking agreements with schedules.

Round 3: Claude Reviews HAQQ's Work

We brought HAQQ's 27-page revision back to Claude for a neutral review.

HAQQ's revision is a major upgrade. It transforms a reasonable AI-generated template into something approaching sign-ready. Grade: B+ to A-.

Claude identified 7 remaining refinements. Good suggestions. But here's where it got interesting.

Round 4: HAQQ Reviews Claude's Review

We fed Claude's seven points of feedback into HAQQ. HAQQ agreed with 6 of the 7. But it went further and flagged 5 issues specific to German law that Claude had missed entirely:

Round 5: HAQQ Implements Everything

This is where HAQQ proved it is not just a critic. We fed in Claude's seven suggestions plus HAQQ's own five Germany-specific findings. What came back: 32 pages. Two agreements. Three schedules.

The Evolution: 10 Pages to 32 Pages in 5 Rounds

From a basic template to a 32-page, 3-schedule package that is compliant with German law, open-source-friendly, and ready for multiple exit paths. Built entirely by AI. Debated by two different systems with different strengths.

Why This Matters

This experiment revealed something important: the best results come from making AI systems argue with each other.

Claude brought breadth. HAQQ brought depth. Neither produced the ideal document alone. Together, through 5 rounds of iterative review, they produced something a German startup lawyer can finalize in a few hours.