# MetaHuman Engine

**Give AI a real-time interactive digital body**

Browser-native 3D digital human engine with voice, vision, and dialogue capabilities.
Zero-config · Offline-ready · Production-grade

[Quick Start](#quick-start) · [Features](#features) · [Architecture](#architecture) · Docs · 中文

React · Three.js · TypeScript · Vite · MIT License


## Quick Start

```bash
# Clone and install
git clone https://github.com/LessUp/meta-human.git
cd meta-human
npm install

# Start development server
npm run dev
```

Open http://localhost:5173 — your 3D avatar is ready!

> No API key required. The engine automatically falls back to local mock mode for out-of-the-box demos.
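The mock fallback can be pictured as a thin wrapper that tries the remote API and degrades to a canned local reply on any failure. This is a minimal sketch, not the engine's actual implementation; `sendWithFallback`, `FetchLike`, and the mock reply are all illustrative names:

```typescript
// Minimal sketch of the mock fallback: try the remote chat API, and degrade
// to a canned local reply when the request fails (offline, no API key, etc.).
// All names and the mock reply below are illustrative, not the engine's code.
type ChatReply = { replyText: string; emotion: string; action: string };

type FetchLike = (
  url: string,
  init?: { method?: string; headers?: Record<string, string>; body?: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

const MOCK_REPLY: ChatReply = {
  replyText: "Hi! I'm running in offline mock mode.",
  emotion: 'happy',
  action: 'wave',
};

// `fetchImpl` is injected so the fallback path is easy to exercise in tests.
export async function sendWithFallback(
  text: string,
  fetchImpl: FetchLike = fetch as unknown as FetchLike,
): Promise<ChatReply> {
  try {
    const res = await fetchImpl('/v1/chat', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ text }),
    });
    if (!res.ok) throw new Error(`HTTP ${res.status}`);
    return (await res.json()) as ChatReply;
  } catch {
    return MOCK_REPLY; // graceful degradation: local mock reply
  }
}
```

Injecting the fetch implementation keeps the fallback path trivially testable without a network.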


## Features

### 🎭 3D Avatar Engine

| Feature | Description |
| --- | --- |
| GLB/GLTF Support | Load custom models or use the built-in procedural avatar |
| Emotion-Driven | Happy, surprised, sad, and angry moods map to facial expressions |
| Skeletal Animation | Wave, greet, nod, and dance animations triggered by dialogue |
| Adaptive Performance | 60 fps rendering with device-based quality scaling |

```typescript
import { digitalHumanEngine } from '@core/avatar';

digitalHumanEngine.perform({
  emotion: 'happy',
  expression: 'smile',
  animation: 'wave'
});
```
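Device-based quality scaling can be approximated by bucketing devices into quality tiers from coarse hardware signals. This is an illustrative sketch; `pickQualityTier` and its thresholds are hypothetical, not the engine's actual heuristic:

```typescript
// Illustrative only: map coarse device signals to a render-quality tier.
// The thresholds below are invented for the example, not the engine's real ones.
type QualityTier = 'low' | 'medium' | 'high';

export function pickQualityTier(opts: {
  hardwareConcurrency: number; // from navigator.hardwareConcurrency
  deviceMemoryGB: number;      // from navigator.deviceMemory (Chromium-only, in GB)
}): QualityTier {
  const { hardwareConcurrency, deviceMemoryGB } = opts;
  if (hardwareConcurrency >= 8 && deviceMemoryGB >= 8) return 'high';
  if (hardwareConcurrency >= 4 && deviceMemoryGB >= 4) return 'medium';
  return 'low';
}
```

The chosen tier would then drive renderer settings such as pixel ratio, shadow resolution, and post-processing.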

### 🗣️ Voice Interaction

| Feature | Description |
| --- | --- |
| TTS | Edge TTS for natural voice synthesis |
| ASR | Browser-native speech recognition |
| Smart Muting | Auto-pause TTS when the user speaks |
| Voice Detection | Visual feedback during recording |

```typescript
import { ttsService, asrService } from '@core/audio';
import { dialogueService } from '@core/dialogue';

await ttsService.speak('Hello! How can I help?');

asrService.start({
  onResult: (text) => dialogueService.send(text)
});
```
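Smart muting boils down to a single priority rule: the user's voice always wins over TTS playback. A dependency-free sketch of that decision logic (`MicGate` is an illustrative name, not the engine's API):

```typescript
// Illustrative sketch of the "smart muting" rule: whenever the user is
// speaking, TTS playback should be paused; it may resume once they stop.
export class MicGate {
  private userSpeaking = false;
  private ttsActive = false;

  onUserSpeech(speaking: boolean): void { this.userSpeaking = speaking; }
  onTtsState(active: boolean): void { this.ttsActive = active; }

  // True exactly when playback should be paused right now.
  shouldPauseTts(): boolean {
    return this.ttsActive && this.userSpeaking;
  }
}
```

In practice the ASR service's voice-activity events would feed `onUserSpeech`, and the TTS player would poll or subscribe to the gate.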

### 🧠 Intelligent Dialogue

| Feature | Description |
| --- | --- |
| Multi-Modal Response | Returns `{ replyText, emotion, action }` |
| Streaming | Real-time token-by-token delivery via SSE |
| Graceful Degradation | Falls back to the local mock when the API is unavailable |
| Session Management | Persistent conversation context |

```typescript
import { dialogueService } from '@core/dialogue';

const response = await dialogueService.send({
  text: 'Tell me a joke',
  sessionId: 'user-123'
});
// → { replyText: '...', emotion: 'happy', action: 'laugh' }
```
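Token-by-token streaming over SSE amounts to splitting the incoming text stream on blank lines and collecting `data:` payloads. A minimal parser sketch (this shows generic SSE framing; the backend's exact payload shape is not assumed here):

```typescript
// Minimal SSE frame parser: feed it raw text chunks, get back the `data:`
// payloads of every complete event. Incomplete events stay buffered.
export function createSseParser() {
  let buffer = '';
  return function feed(chunk: string): string[] {
    buffer += chunk;
    const events: string[] = [];
    let sep: number;
    // Events are separated by a blank line ("\n\n").
    while ((sep = buffer.indexOf('\n\n')) !== -1) {
      const raw = buffer.slice(0, sep);
      buffer = buffer.slice(sep + 2);
      const data = raw
        .split('\n')
        .filter((line) => line.startsWith('data:'))
        .map((line) => line.slice(5).trimStart())
        .join('\n');
      if (data) events.push(data);
    }
    return events;
  };
}
```

Each returned payload would then be appended to the chat UI as it arrives, giving the token-by-token effect.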

### 👁️ Visual Perception

| Feature | Description |
| --- | --- |
| Face Mesh | 468 landmarks for micro-expression detection |
| Pose Estimation | Upper-body gesture recognition |
| Emotion Mapping | Real-time emotion inference |
| Privacy First | All processing happens in the browser; no data leaves the client |
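Emotion mapping from face landmarks typically reduces to thresholding a few blendshape-style scores. A hypothetical sketch of that idea (the score names and thresholds are invented for illustration; the real `visionMapper` may work differently):

```typescript
// Hypothetical emotion inference from normalized expression scores (0..1),
// e.g. derived from face-mesh blendshapes. Thresholds are illustrative only.
type Emotion = 'happy' | 'surprised' | 'sad' | 'neutral';

export function mapScoresToEmotion(scores: {
  mouthSmile: number;
  browRaise: number;
  mouthFrown: number;
}): Emotion {
  if (scores.mouthSmile > 0.5) return 'happy';
  if (scores.browRaise > 0.6) return 'surprised';
  if (scores.mouthFrown > 0.5) return 'sad';
  return 'neutral';
}
```

Because the scores never leave this pure function, the privacy-first property holds: inference runs entirely client-side.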

## Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                          UI Layer                                │
│   ChatDock · TopHUD · ControlPanel · SettingsDrawer             │
└─────────────────────────────────────────────────────────────────┘
                                │
┌─────────────────────────────────────────────────────────────────┐
│                       Core Engine Layer                          │
│   Avatar · Dialogue · Vision · Audio · Performance              │
└─────────────────────────────────────────────────────────────────┘
                                │
┌─────────────────────────────────────────────────────────────────┐
│                       State Layer                                │
│   chatSessionStore · systemStore · digitalHumanStore            │
└─────────────────────────────────────────────────────────────────┘
                                │
┌─────────────────────────────────────────────────────────────────┐
│                      External Services                           │
│   Three.js · Web Speech API · MediaPipe · OpenAI API            │
└─────────────────────────────────────────────────────────────────┘
```

### State Management

Three focused store domains keep re-renders to a minimum:

| Store | Responsibility |
| --- | --- |
| `chatSessionStore` | Message history, session lifecycle |
| `systemStore` | Connection status, errors, performance metrics |
| `digitalHumanStore` | Avatar runtime state (expression, animation, audio) |
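The shape of such a store can be illustrated with a minimal, dependency-free stand-in for Zustand's `create` (the real stores use Zustand itself; the state fields below are illustrative, loosely based on the table):

```typescript
// Dependency-free stand-in mimicking a slice of Zustand's API (getState /
// subscribe, plus a `set` passed to the initializer) purely for illustration.
// The project itself uses Zustand; field names here are illustrative.
type SetState<T> = (partial: Partial<T>) => void;

function createStore<T>(init: (set: SetState<T>) => T) {
  let state: T;
  const listeners = new Set<(s: T) => void>();
  const set: SetState<T> = (partial) => {
    state = { ...state, ...partial };
    listeners.forEach((l) => l(state));
  };
  state = init(set);
  return {
    getState: () => state,
    subscribe(l: (s: T) => void) {
      listeners.add(l);
      return () => listeners.delete(l);
    },
  };
}

interface DigitalHumanState {
  expression: string;
  animation: string;
  setExpression: (expression: string) => void;
}

export const digitalHumanStore = createStore<DigitalHumanState>((set) => ({
  expression: 'neutral',
  animation: 'idle',
  setExpression: (expression) => set({ expression }),
}));
```

Splitting state into narrow domains like this means a chat message arriving never re-renders the 3D viewport, and vice versa.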

## Project Structure

```
src/
├── core/                          # Engine modules
│   ├── avatar/                    # 3D rendering & animation
│   │   ├── DigitalHumanEngine.ts  # Unified driver
│   │   └── constants.ts           # Expressions, animations
│   ├── audio/                     # TTS & ASR services
│   ├── dialogue/                  # Chat transport & orchestration
│   │   ├── dialogueService.ts     # API client
│   │   ├── dialogueOrchestrator.ts
│   │   └── chatTransport.ts       # HTTP/SSE/WebSocket
│   ├── vision/                    # MediaPipe pipeline
│   │   ├── visionService.ts
│   │   └── visionMapper.ts
│   └── performance/               # Device capability detection
├── components/                    # React components
│   ├── DigitalHumanViewer.tsx     # 3D viewport
│   ├── ChatDock.tsx               # Chat interface
│   ├── TopHUD.tsx                 # Status bar
│   ├── ControlPanel.tsx           # Quick controls
│   ├── VoiceInteractionPanel.tsx
│   ├── VisionMirrorPanel.tsx
│   └── ui/                        # Shared primitives
├── store/                         # Zustand stores
│   ├── chatSessionStore.ts
│   ├── systemStore.ts
│   └── digitalHumanStore.ts
├── hooks/                         # Custom hooks
├── pages/                         # Route pages
└── lib/                           # Utilities
```

## Deployment

### GitHub Pages (Frontend)

The frontend is built with `npm run build:pages`. To deploy:

1. Set `VITE_API_BASE_URL` in GitHub Repository Variables
2. Push to `master` or run the "Deploy Pages" workflow
3. Live at: https://lessup.github.io/meta-human/

### Render (Backend)

Use the `render.yaml` blueprint. It deploys a FastAPI backend exposing:

```
POST /v1/chat          # OpenAI-compatible chat
POST /v1/chat/stream   # SSE streaming
POST /v1/tts           # Text-to-speech
POST /v1/asr           # Speech-to-text
WebSocket /ws          # Real-time streaming
```

Required environment variables:

| Variable | Description |
| --- | --- |
| `OPENAI_API_KEY` | AI responses (optional; falls back to mock) |
| `CORS_ALLOW_ORIGINS` | Frontend domain for CORS |
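From the frontend, calling the chat endpoint reduces to a single POST with a JSON body. A hedged sketch of a request builder (the `{ text, sessionId }` schema is assumed from the dialogue example earlier, not verified against the backend; `buildChatRequest` is an illustrative name):

```typescript
// Illustrative request builder for the POST /v1/chat endpoint. The request
// schema is assumed for this sketch, not taken from the backend's contract.
export function buildChatRequest(baseUrl: string, text: string, sessionId: string) {
  return {
    url: `${baseUrl.replace(/\/$/, '')}/v1/chat`,
    init: {
      method: 'POST' as const,
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ text, sessionId }),
    },
  };
}

// Usage: const { url, init } = buildChatRequest(apiBase, 'Hi', 'user-123');
//        const reply = await fetch(url, init).then((r) => r.json());
```

Keeping the URL/body construction pure makes it easy to test without touching the network.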

## Scripts

```bash
# Development
npm run dev              # Start dev server (port 5173)
npm run preview          # Preview production build

# Build
npm run build            # Production build
npm run build:pages      # GitHub Pages build

# Quality
npm run lint             # ESLint check
npm run lint:fix         # Auto-fix ESLint issues
npm run format           # Prettier formatting
npm run typecheck        # TypeScript check

# Testing
npm run test             # Vitest watch mode
npm run test:run         # Run tests once
npm run test:coverage    # Coverage report
```

## Browser Support

| Chrome | Edge | Firefox | Safari |
| --- | --- | --- | --- |
| 90+ ✅ | 90+ ✅ | 90+ ✅ | 15+ ✅ |

> Speech Recognition (ASR) requires Chrome or Edge due to Web Speech API limitations.
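That limitation is best handled by feature detection rather than user-agent sniffing. A small sketch (the function takes the global object as a parameter so it can be exercised outside a browser):

```typescript
// Feature-detect the Web Speech API's recognition interface. Chromium-based
// browsers expose it with the webkit prefix; most others omit it entirely.
export function hasSpeechRecognition(globalLike: Record<string, unknown>): boolean {
  return 'SpeechRecognition' in globalLike || 'webkitSpeechRecognition' in globalLike;
}

// Usage in the browser: if (!hasSpeechRecognition(window)) { /* hide mic UI */ }
```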


## Contributing

1. Fork the repository
2. Create a feature branch: `git checkout -b feat/amazing-feature`
3. Commit changes: `git commit -m 'feat: add amazing feature'`
4. Push: `git push origin feat/amazing-feature`
5. Open a Pull Request

Follow Conventional Commits.


## License

MIT © LessUp


Built with ❤️ to make digital humans accessible to everyone
