AI · 10 min read

Building an AI Chatbot for the Enterprise: Architecture and Best Practices

Designing a production-grade AI chatbot: intent classification, RAG, conversation memory, escalation flow, and evaluation metrics.

By Ventra Rocket · Published February 28, 2026

#AI Chatbot #RAG #NLP #Enterprise AI #LangChain

An enterprise AI chatbot needs intent routing, knowledge grounding, conversation memory, and escalation logic; a single API call to an LLM is not enough.

Architecture

User message → Input Guard → Intent Classifier
    ↓
FAQ Handler | RAG Handler | Task Handler
    ↓
Response Generator (LLM)
    ↓
Output Guard → Escalation Check → Human Agent (if confidence < 0.7)
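The flow above can be read as a small dispatch pipeline. The sketch below is only an illustration: the handler functions, the `HANDLERS` table, the intent-to-handler mapping, and `handle_message` are all hypothetical names, the classifier is passed in as a callable so the example runs without an LLM, and the guards and the response generator are omitted. The 0.7 threshold is the one shown in the diagram.

```python
from typing import Callable

CONFIDENCE_THRESHOLD = 0.7  # escalation threshold taken from the diagram

# Hypothetical handlers; real ones would call the FAQ store, RAG, or backend APIs.
def faq_handler(message: str) -> str:
    return f"[faq] {message}"

def rag_handler(message: str) -> str:
    return f"[rag] {message}"

def task_handler(message: str) -> str:
    return f"[task] {message}"

# Assumed mapping from intent label to handler (not specified in the article).
HANDLERS: dict[str, Callable[[str], str]] = {
    "faq": faq_handler,
    "technical_support": rag_handler,
    "order_status": task_handler,
}

def handle_message(message: str, classify: Callable[[str], dict]) -> dict:
    """Route one message: classify, dispatch, then run the escalation check."""
    intent = classify(message)
    handler = HANDLERS.get(intent["intent"])
    reply = handler(message) if handler else None
    # Escalation check runs last, as in the diagram: hand off to a human
    # when no handler matched or the classifier was not confident enough.
    if reply is None or intent["confidence"] < CONFIDENCE_THRESHOLD:
        return {"route": "human_agent", "reply": None}
    return {"route": intent["intent"], "reply": reply}
```

In this sketch, intents without a handler (such as `complaint` or `escalate`) fall through to the human agent regardless of confidence.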

1. Intent Classification

import json

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

INTENT_SYSTEM = """Classify the message into one of these intents:
- faq: general questions about products/services
- order_status: order lookup
- technical_support: technical issues
- complaint: complaints
- escalate: request to talk to a human agent

Return JSON: {"intent": "...", "confidence": 0.0-1.0}"""

def classify_intent(message: str) -> dict:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": INTENT_SYSTEM},
            {"role": "user", "content": message},
        ],
        response_format={"type": "json_object"},  # JSON mode: the reply is a JSON string
        temperature=0,  # keep classification as repeatable as possible
    )
    # Parse with json.loads, never eval: the model output is untrusted text.
    return json.loads(response.choices[0].message.content)
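Even with JSON mode, the model's reply is untrusted text, so it can help to validate it before routing. A minimal sketch, assuming a hypothetical helper `parse_intent` and an assumed fallback policy of treating malformed or unknown output as `escalate` with zero confidence; the intent labels are the ones listed in the classification prompt.

```python
import json

VALID_INTENTS = {"faq", "order_status", "technical_support", "complaint", "escalate"}
FALLBACK = {"intent": "escalate", "confidence": 0.0}  # assumed fallback policy

def parse_intent(raw: str) -> dict:
    """Parse and validate the classifier's JSON reply."""
    try:
        data = json.loads(raw)
        intent = data["intent"]
        confidence = float(data["confidence"])
    except (json.JSONDecodeError, KeyError, TypeError, ValueError):
        return dict(FALLBACK)
    if not isinstance(intent, str) or intent not in VALID_INTENTS:
        return dict(FALLBACK)
    # Clamp confidence into [0.0, 1.0] in case the model drifts out of range.
    return {"intent": intent, "confidence": min(max(confidence, 0.0), 1.0)}
```

Falling back to `escalate` is a conservative choice: a reply the bot cannot parse is treated like a request it cannot handle.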

2. RAG-Grounded Responses

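# Assumes `qdrant` is a configured qdrant_client.QdrantClient and `get_embedding`
# returns the embedding vector for a string; neither is defined in this article.
def retrieve_context(query: str, collection: str) -> list[str]:
    embedding = get_embedding(query)
    results = qdrant.search(
        collection_name=collection,
        query_vector=embedding,
        limit=4,               # return at most 4 chunks
        score_threshold=0.75,  # filter out low-scoring matches
    )
    # Each point's payload is expected to hold the chunk under the "text" key.
    return [r.payload["text"] for r in results]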
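Retrieval only helps if the retrieved passages actually reach the model. One way to do that, sketched here with a hypothetical `build_grounded_messages` helper and an assumed system-prompt wording, is to number the passages, place them in the system message, and instruct the model to answer only from them. An empty retrieval result (nothing above the score threshold) returns `None` so the caller can fall back or escalate.

```python
from typing import Optional

# Assumed prompt wording; adjust to the product's tone and language.
GROUNDED_SYSTEM = (
    "Answer using only the numbered context passages below. "
    "If the answer is not in the context, say you do not know.\n\n"
    "Context:\n{context}"
)

def build_grounded_messages(question: str, passages: list[str]) -> Optional[list[dict]]:
    """Build chat messages that ground the answer in retrieved passages."""
    if not passages:
        return None  # nothing retrieved: let the caller fall back or escalate
    context = "\n".join(f"[{i}] {text}" for i, text in enumerate(passages, start=1))
    return [
        {"role": "system", "content": GROUNDED_SYSTEM.format(context=context)},
        {"role": "user", "content": question},
    ]
```

The returned list can be passed as the `messages` argument of a chat-completion call.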

3. Conversation Memory with Redis

import json

# Assumes `redis_client` is a configured redis.Redis instance.
def save_message(session_id: str, role: str, content: str) -> None:
    key = f"chat:{session_id}"
    msg = json.dumps({"role": role, "content": content})
    redis_client.rpush(key, msg)
    redis_client.expire(key, 3600)  # refresh the 1-hour TTL on every message
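Reading the history back is the other half of the memory layer. A minimal sketch, assuming a hypothetical `load_history` helper that takes the Redis client as a parameter and an assumed cap of 20 messages to bound prompt size; it relies only on the standard Redis `LRANGE` command, where negative indices count from the end of the list.

```python
import json

def load_history(client, session_id: str, max_messages: int = 20) -> list[dict]:
    """Return the most recent messages for a session, oldest first."""
    # LRANGE key -N -1 returns the last N items of the list.
    raw_items = client.lrange(f"chat:{session_id}", -max_messages, -1)
    return [json.loads(item) for item in raw_items]
```

Each returned item has the same `role` and `content` keys that were stored, so the list can be placed between the system prompt and the new user message.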

4. Escalation Logic

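# Vietnamese trigger phrases, matched against the lowercased user message:
# "gặp nhân viên" = "meet a staff member",
# "muốn nói chuyện người thật" = "want to talk to a real person",
# "bức xúc" = "frustrated/upset".
ESCALATION_TRIGGERS = ["gặp nhân viên", "muốn nói chuyện người thật", "bức xúc"]

def should_escalate(message: str, intent: dict, failures: int) -> bool:
    # 1. The user explicitly asks for a human or expresses frustration.
    if any(t in message.lower() for t in ESCALATION_TRIGGERS):
        return True
    # 2. The classifier is unsure and the bot has already failed at least twice.
    if intent["confidence"] < 0.5 and failures >= 2:
        return True
    # 3. Complaints and classified escalation requests always go to a human.
    if intent["intent"] in ("complaint", "escalate"):
        return True
    return False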

5. Evaluation Metrics

| Metric | Target |
|--------|--------|
| Intent accuracy | > 90% |
| Self-resolution rate | > 75% |
| CSAT | > 4.0/5 |
| Conversation turns to resolution | < 4 |
| Escalation rate | < 20% |
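These targets are only useful if they are computed the same way every time. Below is a sketch of one possible computation over logged conversations, assuming a hypothetical `Conversation` record with the fields shown. The field names and the definition of self-resolution (resolved without escalation) are assumptions, not taken from the article, and intent accuracy and CSAT are left out because they need labeled data and survey responses.

```python
from dataclasses import dataclass

@dataclass
class Conversation:
    turns: int        # number of user turns in the conversation
    resolved: bool    # whether the user's issue was resolved
    escalated: bool   # whether it was handed to a human agent

def summarize(conversations: list[Conversation]) -> dict:
    """Compute self-resolution rate, escalation rate, and average turns to resolution."""
    total = len(conversations)
    if total == 0:
        return {"self_resolution_rate": 0.0, "escalation_rate": 0.0, "avg_turns_to_resolution": 0.0}
    # Self-resolved: the bot resolved the issue without a human handoff.
    self_resolved = [c for c in conversations if c.resolved and not c.escalated]
    escalated = sum(c.escalated for c in conversations)
    avg_turns = (
        sum(c.turns for c in self_resolved) / len(self_resolved) if self_resolved else 0.0
    )
    return {
        "self_resolution_rate": len(self_resolved) / total,
        "escalation_rate": escalated / total,
        "avg_turns_to_resolution": avg_turns,
    }
```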

Conclusion

Ventra Rocket has built chatbots that handle thousands of conversations per day with a self-resolution rate above 80%.
