GLM-4.7 - Installation and Configuration for Low-End PCs
This repository contains automated scripts to install and configure the GLM-4.7 model on machines with limited resources.
📋 Minimum Requirements
Minimum Hardware (CPU Only)
- RAM: 32GB+ (64GB+ recommended)
- Disk: 200GB+ free space (SSD recommended)
- CPU: Modern multi-core processor
Recommended Hardware (with GPU)
- GPU: 8GB+ VRAM (16GB+ recommended)
- RAM: 64GB+ (128GB+ recommended)
- Disk: 300GB+ free space (NVMe SSD recommended)
- CUDA: Compatible with CUDA 11.8+ or 12.1+
🎯 Available Model Versions
GLM-4.7 is available in various quantizations for different hardware capabilities:
| Version | Size | Minimum RAM | Minimum VRAM | Recommended Use |
|---|---|---|---|---|
| UD-Q2_K_XL (2-bit) | ~135GB | 128GB | 24GB | Powerful machines with GPU |
| Q4_K_M (4-bit) | ~200GB | 64GB | 16GB | Moderate machines |
| Q4_K_S (4-bit) | ~180GB | 48GB | 12GB | Modest machines |
| Q5_K_M (5-bit) | ~240GB | 80GB | 20GB | Best quality |
🚀 Quick Start
Windows (PowerShell)
# 1. Install dependencies
.\scripts\install.ps1
# 2. Download model (choose appropriate version)
.\scripts\download-model.ps1 -Version "Q4_K_S"
# 3. Run the model
.\scripts\run-llamacpp.ps1
Linux/Mac (Bash)
# 1. Install dependencies
chmod +x scripts/install.sh
./scripts/install.sh
# 2. Download model (choose appropriate version)
chmod +x scripts/download-model.sh
./scripts/download-model.sh Q4_K_S
# 3. Run the model
chmod +x scripts/run-llamacpp.sh
./scripts/run-llamacpp.sh
📁 Repository Structure
.
├── README.md # This file
├── scripts/
│ ├── install.sh # Linux/Mac installation
│ ├── install.ps1 # Windows installation
│ ├── download-model.sh # Model download (Linux/Mac)
│ ├── download-model.ps1 # Model download (Windows)
│ ├── run-llamacpp.sh # Run with llama.cpp (Linux/Mac)
│ ├── run-llamacpp.ps1 # Run with llama.cpp (Windows)
│ ├── run-ollama.sh # Run with Ollama (Linux/Mac)
│ └── run-ollama.ps1 # Run with Ollama (Windows)
├── config/
│ ├── hardware-config.yaml # Hardware configuration
│ └── model-config.json # Model settings
└── models/ # Directory for downloaded models
⚙️ Configuration
1. Configure Hardware
Edit config/hardware-config.yaml with your machine specifications:
hardware:
gpu:
available: true
vram_gb: 8
cuda_arch: "75" # For RTX 2060, 2070, 2080
ram_gb: 32
cpu_cores: 8
disk_space_gb: 500
2. Choose Model Version
Based on your hardware, choose the appropriate version:
- Very low-end PC (32GB RAM, no GPU): Use
Q4_K_Sor consider smaller models - Moderate PC (64GB RAM, 8-16GB GPU): Use
Q4_K_M - Powerful PC (128GB+ RAM, 24GB+ GPU): Use
UD-Q2_K_XLorQ5_K_M
🔧 Execution Methods
Option 1: llama.cpp (Recommended for limited hardware)
llama.cpp offers better control over CPU/GPU offloading and quantization.
Advantages:
- Intelligent offloading support
- Lower memory usage
- Better for limited hardware
Option 2: Ollama (Simpler)
Ollama is easier to use, but may be less efficient on limited hardware.
Advantages:
- Simpler installation
- More user-friendly interface
- Automatic model management
📝 Usage Examples
Run with small context (saves memory)
./scripts/run-llamacpp.sh --ctx-size 4096 --threads 4
Run CPU-only
./scripts/run-llamacpp.sh --cpu-only
Run with partial CPU offloading
./scripts/run-llamacpp.sh --gpu-layers 10
🐛 Troubleshooting
Error: "Out of memory"
- Reduce
--ctx-size(context size) - Use a more quantized version of the model
- Reduce
--gpu-layersto offload more to CPU
Error: "CUDA not found"
- Check if CUDA is installed:
nvidia-smi - Recompile llama.cpp with CUDA support
Model too slow
- Increase
--threads(number of CPU threads) - Use more GPU layers if you have available VRAM
- Consider using a lighter version of the model
📚 Additional Resources
- Official GLM-4.7 Documentation
- llama.cpp GitHub
- Ollama Documentation
- Quantized models on Hugging Face
📄 License
This repository contains installation and configuration scripts. The GLM-4.7 model has its own license - consult the official repository.
🤝 Contributions
Contributions are welcome! Feel free to open issues or pull requests.
⚠️ Warnings
- Large models may take a long time to download (100GB+)
- First run may be slow while the model loads
- Make sure you have enough disk space before downloading
- On very limited hardware, consider using smaller models or cloud services