N8N Syllabus: Units 6-10
AI, Scalability, Security, and Use Cases
Unit 6: AI Integrations
6.1. AI Nodes in N8N
N8N has native AI support through specialized nodes.
Available AI nodes:
- OpenAI (GPT-4, GPT-3.5, DALL-E, Whisper)
- Anthropic Claude
- Google PaLM/Gemini
- Hugging Face
- Cohere
- AI Agent (LangChain)
- Text Classifier
- Sentiment Analysis
Example: OpenAI GPT-4
[Trigger: New Support Ticket]
↓
[OpenAI Chat Model]
Model: gpt-4
System Message: "You are a customer support specialist..."
User Message: {{$json.ticket_description}}
Temperature: 0.3
Max Tokens: 500
↓
[Parse AI Response]
↓
[Update Ticket with AI Suggestion]
6.2. AI Use Cases
1. Sentiment Analysis
Sentiment analysis pipeline:
[Get Customer Reviews]
↓
[Split In Batches: 50]
↓
[OpenAI Chat Model]
Prompt: "Analyze sentiment of this review and return JSON:
{sentiment: 'positive'|'negative'|'neutral',
confidence: 0-1,
key_topics: []}"
Review: {{$json.review_text}}
↓
[Parse JSON Response]
↓
[IF: Negative sentiment]
├─ true → [Alert Customer Success Team]
└─ false → [Store in Analytics DB]
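The [Parse JSON Response] step above can be sketched as a small function. This is a minimal illustration, assuming the model replies with the JSON shape requested in the prompt (sentiment, confidence, key_topics); the function name parseSentimentResponse and the needs_alert flag are hypothetical and not part of N8N.

```javascript
// Hypothetical helper for the [Parse JSON Response] step.
// Field names (sentiment, confidence, key_topics) come from the prompt above.
const ALLOWED_SENTIMENTS = ['positive', 'negative', 'neutral'];

function parseSentimentResponse(rawText) {
  // Models sometimes wrap JSON in extra text, so keep only the outermost {...}.
  const start = rawText.indexOf('{');
  const end = rawText.lastIndexOf('}');
  if (start === -1 || end <= start) {
    throw new Error('No JSON object found in model response');
  }
  const parsed = JSON.parse(rawText.slice(start, end + 1));

  if (!ALLOWED_SENTIMENTS.includes(parsed.sentiment)) {
    throw new Error(`Unexpected sentiment: ${parsed.sentiment}`);
  }
  const confidence = Number(parsed.confidence);
  if (!Number.isFinite(confidence) || confidence < 0 || confidence > 1) {
    throw new Error(`Confidence out of range: ${parsed.confidence}`);
  }
  return {
    sentiment: parsed.sentiment,
    confidence,
    key_topics: Array.isArray(parsed.key_topics) ? parsed.key_topics : [],
    // Flag the IF node could use to route negative reviews to the alert branch.
    needs_alert: parsed.sentiment === 'negative',
  };
}

const example = parseSentimentResponse(
  'Here is the result: {"sentiment": "negative", "confidence": 0.92, "key_topics": ["shipping"]}'
);
console.log(example.needs_alert); // true
```

Rejecting malformed replies at this step keeps invalid values out of the analytics database and makes the IF condition a simple boolean check.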
2. Information Extraction
[Receive Invoice PDF]
↓
[Extract Text from PDF]
↓
[OpenAI Chat Model]
Prompt: "Extract invoice data as JSON:
{
invoice_number: '',
date: '',
vendor: '',
total_amount: 0,
line_items: []
}"
Text: {{$json.pdf_text}}
↓
[Validate Extracted Data]
↓
[Save to Accounting System]
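One possible shape for the [Validate Extracted Data] step is sketched below. It assumes the extracted object uses the field names from the prompt above; the validateInvoice function and its specific rules (non-empty strings, parseable date, positive total) are illustrative assumptions, not a prescribed validation policy.

```javascript
// Hypothetical validator for the [Validate Extracted Data] step.
// It collects every problem instead of stopping at the first one.
function validateInvoice(invoice) {
  const errors = [];
  if (typeof invoice.invoice_number !== 'string' || invoice.invoice_number.trim() === '') {
    errors.push('invoice_number is missing');
  }
  if (Number.isNaN(Date.parse(invoice.date))) {
    errors.push('date is not parseable');
  }
  if (typeof invoice.vendor !== 'string' || invoice.vendor.trim() === '') {
    errors.push('vendor is missing');
  }
  if (typeof invoice.total_amount !== 'number' || !(invoice.total_amount > 0)) {
    errors.push('total_amount must be a positive number');
  }
  if (!Array.isArray(invoice.line_items) || invoice.line_items.length === 0) {
    errors.push('line_items is empty');
  }
  return { valid: errors.length === 0, errors };
}

const extracted = {
  invoice_number: 'INV-001',
  date: '2026-02-09',
  vendor: 'Acme',
  total_amount: 150,
  line_items: [{ description: 'Widget', amount: 150 }],
};
console.log(validateInvoice(extracted).valid); // true
```

Returning the full error list makes it easier to send failed invoices to manual review with a clear reason.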
3. Text Classification
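// Code Node: Batch classification
// Note: `openai.complete` is a placeholder from the original example, not an n8n built-in.
// It is assumed to send the prompt and return the model's reply (JSON text or parsed JSON).
const texts = $input.all();
const batches = [];
// Group into batches of 10
for (let i = 0; i < texts.length; i += 10) {
  batches.push(texts.slice(i, i + 10));
}
// Classify each batch
const results = [];
for (const [batchIndex, batch] of batches.entries()) {
  // Number texts globally so text_id stays unique across batches
  const offset = batchIndex * 10;
  const prompt = `Classify these texts into categories: Tech, Finance, Health, Other
${batch.map((t, i) => `${offset + i + 1}. ${t.json.text}`).join('\n')}
Return JSON array: [{"text_id": 1, "category": "Tech"}, ...]`;
  const response = await openai.complete(prompt);
  // Parse the reply if it arrives as text before collecting the results
  const parsed = typeof response === 'string' ? JSON.parse(response) : response;
  results.push(...parsed);
}
return results.map(r => ({json: r}));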
4. Summary Generation
[Daily News Articles] (100 articles)
↓
[Filter: Technology category]
↓
[OpenAI Chat Model]
System: "Summarize tech news in 2-3 sentences"
Articles: {{$json.articles}}
↓
[Combine Summaries]
↓
[Generate Email Newsletter]
↓
[Send to Subscribers]
6.3. AI in Data Pipelines
Intelligent enrichment:
[Raw Customer Data]
↓
[OpenAI: Standardize Company Names]
Prompt: "Standardize company name: {{$json.company_raw}}"
Examples: "MSFT → Microsoft, GOOGL → Google"
↓
[OpenAI: Infer Industry]
Prompt: "Based on company name and description,
classify industry"
↓
[OpenAI: Generate Company Summary]
↓
[Enriched Data]
Anomaly detection:
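// Code Node: AI-powered anomaly detection
// Note: `calculateStats` and `openai.complete` are not n8n built-ins. Both are assumed
// helpers that must be defined or replaced (for example, with an HTTP call to the model API).
const metrics = $input.all();
const historicalData = metrics.slice(0, -1);
const currentData = metrics[metrics.length - 1];
const prompt = `
Historical metrics (mean±std):
${JSON.stringify(calculateStats(historicalData))}
Current metrics:
${JSON.stringify(currentData.json)}
Analyze if current metrics show anomalies. Return JSON:
{
  is_anomaly: boolean,
  anomalous_fields: [],
  severity: 'low'|'medium'|'high',
  explanation: ''
}
`;
const response = await openai.complete(prompt);
// Parse the reply if it arrives as text
const aiAnalysis = typeof response === 'string' ? JSON.parse(response) : response;
if (aiAnalysis.is_anomaly) {
  // Trigger alert
}
// A Code node must return items, so pass the analysis on to the next node
return [{json: aiAnalysis}];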
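The anomaly-detection snippet calls calculateStats without defining it. A minimal sketch of such a helper follows, assuming each item has the { json: {...} } shape returned by $input.all() and that only numeric fields matter. It uses the population standard deviation, and the sample field names (latency_ms, errors) are made up for illustration.

```javascript
// Hypothetical implementation of the calculateStats helper.
// Assumed input shape: [{ json: { fieldName: number, ... } }, ...]
function calculateStats(items) {
  // Collect the numeric values of each field across all items.
  const valuesByField = {};
  for (const item of items) {
    for (const [field, value] of Object.entries(item.json)) {
      if (typeof value === 'number' && Number.isFinite(value)) {
        if (!valuesByField[field]) valuesByField[field] = [];
        valuesByField[field].push(value);
      }
    }
  }
  // Compute the mean and population standard deviation per field.
  const stats = {};
  for (const [field, values] of Object.entries(valuesByField)) {
    const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
    const variance = values.reduce((sum, v) => sum + (v - mean) ** 2, 0) / values.length;
    stats[field] = { mean, std: Math.sqrt(variance) };
  }
  return stats;
}

const history = [
  { json: { latency_ms: 100, errors: 0 } },
  { json: { latency_ms: 120, errors: 2 } },
  { json: { latency_ms: 110, errors: 1 } },
];
console.log(calculateStats(history));
```

Sending only these summary statistics, rather than the raw history, keeps the prompt short and limits the amount of data shared with the model.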
Intelligent cleaning:
[Messy Data]
↓
[AI Agent: Data Cleaning]
Tools:
- Detect and fix typos
- Standardize formats
- Infer missing values
- Remove duplicates (fuzzy matching)
↓
[Validate Cleaned Data]
↓
[Load to Clean Database]
Unit 7: Scalability and Production
7.1. Queue Mode with Redis
Configuration:
# docker-compose.yml
# Note: in queue mode every n8n container must also share the same N8N_ENCRYPTION_KEY
# and a common database (PostgreSQL); those settings are omitted here for brevity.
services:
  redis:
    image: redis:7-alpine
    command: redis-server --appendonly yes
    volumes:
      - redis_data:/data
  n8n-main:
    image: n8nio/n8n
    environment:
      - EXECUTIONS_MODE=queue
      - QUEUE_BULL_REDIS_HOST=redis
      - QUEUE_BULL_REDIS_PORT=6379
      - QUEUE_BULL_REDIS_DB=0
    ports:
      - "5678:5678"
  n8n-worker-1:
    image: n8nio/n8n
    command: worker
    environment:
      - EXECUTIONS_MODE=queue
      - QUEUE_BULL_REDIS_HOST=redis
  n8n-worker-2:
    image: n8nio/n8n
    command: worker
    environment:
      - EXECUTIONS_MODE=queue
      - QUEUE_BULL_REDIS_HOST=redis
  n8n-worker-3:
    image: n8nio/n8n
    command: worker
    environment:
      - EXECUTIONS_MODE=queue
      - QUEUE_BULL_REDIS_HOST=redis
# Named volumes must be declared at the top level
volumes:
  redis_data:
Advantages:
- ✅ Distributed executions
- ✅ High availability
- ✅ Horizontal scaling (add workers)
- ✅ Isolation (a worker crash does not affect the others)
7.2. Workflow Optimization
Best practices:
1. Batch processing:
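As a rough illustration of batch processing, the sketch below splits a list of records into fixed-size batches so that each downstream call handles a bounded number of items. The chunk function name and the batch size of 50 are assumptions chosen for the example.

```javascript
// Split an array into consecutive batches of at most `size` items.
function chunk(items, size) {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError('size must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// 125 hypothetical records processed 50 at a time.
const records = Array.from({ length: 125 }, (_, i) => ({ id: i + 1 }));
const recordBatches = chunk(records, 50);
console.log(recordBatches.map((batch) => batch.length)); // [ 50, 50, 25 ]
```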
2. Caching:
[Check Redis Cache]
↓ miss
[Fetch from API] (slow)
↓
[Store in Cache: TTL 3600s]
↓
[Use cached data] (fast)
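The cache flow above follows the cache-aside pattern. The sketch below uses an in-memory Map as a stand-in for Redis, so it shows only the logic. TtlCache, getWithCache, and slowFetch are hypothetical names, and a real workflow would use the Redis node or a Redis client instead.

```javascript
// In-memory stand-in for Redis with per-key expiry (illustration only).
class TtlCache {
  constructor(now = () => Date.now()) {
    this.store = new Map();
    this.now = now; // injectable clock, which makes expiry easy to test
  }
  get(key) {
    const entry = this.store.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.store.delete(key); // expired entries count as a miss
      return undefined;
    }
    return entry.value;
  }
  set(key, value, ttlSeconds) {
    this.store.set(key, { value, expiresAt: this.now() + ttlSeconds * 1000 });
  }
}

// Cache-aside: return the cached value on a hit; on a miss, fetch and store it.
async function getWithCache(cache, key, fetchFn, ttlSeconds = 3600) {
  const cached = cache.get(key);
  if (cached !== undefined) return cached;
  const fresh = await fetchFn(key);
  cache.set(key, fresh, ttlSeconds);
  return fresh;
}

const cache = new TtlCache();
let apiCalls = 0;
const slowFetch = async (key) => {
  apiCalls += 1; // counts how often the "API" is actually hit
  return { key, fetchedAt: Date.now() };
};
getWithCache(cache, 'customer:42', slowFetch)
  .then(() => getWithCache(cache, 'customer:42', slowFetch))
  .then(() => console.log(`API calls: ${apiCalls}`)); // API calls: 1
```

The default TTL of 3600 seconds matches the value shown in the diagram; a shorter TTL trades more API calls for fresher data.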
3. Parallel execution:
[Main Flow]
├─ [Branch 1: Process A] ─┐
├─ [Branch 2: Process B] ─┤
└─ [Branch 3: Process C] ─┴─> [Merge Results]
4. Avoid nested loops:
❌ Bad:
[Loop 1000 users]
└─> [Loop 100 orders per user] = 100,000 iterations
✅ Good:
[Get all orders for all users: 1 query]
└─> [Group by user: 1 operation]
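The "group by user" step can be done in a single pass over the orders. The sketch below assumes each order carries a user_id field; the field names and sample data are illustrative.

```javascript
// Group a flat list of orders by user in one pass (O(n)), avoiding a nested loop.
function groupOrdersByUser(orders) {
  const byUser = new Map();
  for (const order of orders) {
    if (!byUser.has(order.user_id)) byUser.set(order.user_id, []);
    byUser.get(order.user_id).push(order);
  }
  return byUser;
}

const orders = [
  { order_id: 1, user_id: 'u1', total: 20 },
  { order_id: 2, user_id: 'u2', total: 35 },
  { order_id: 3, user_id: 'u1', total: 15 },
];
const grouped = groupOrdersByUser(orders);
console.log(grouped.get('u1').length); // 2
```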
7.3. Monitoring
Key metrics:
// Workflow execution metrics
{
workflow_id: "abc123",
execution_id: "exec_456",
start_time: "2026-02-09T10:00:00Z",
end_time: "2026-02-09T10:05:23Z",
duration_ms: 323000,
status: "success", // or "error"
nodes_executed: 12,
items_processed: 5000,
error_count: 0,
retry_count: 1
}
Alerts:
[Workflow Complete]
↓
[IF: Duration > 5min OR Errors > 0]
↓ true
[Slack: Alert DevOps]
Message: "⚠️ Workflow {{$workflow.name}} took {{$json.duration_ms}}ms"
↓
[PagerDuty: Create Incident]
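The alert condition can be expressed as a small predicate over the execution metrics record. This sketch assumes the field names from the metrics example (duration_ms, error_count); shouldAlert and buildAlertMessage are hypothetical helper names.

```javascript
// Threshold from the IF node above: 5 minutes, expressed in milliseconds.
const MAX_DURATION_MS = 5 * 60 * 1000;

// True when the execution was too slow or produced any error.
function shouldAlert(metrics) {
  return metrics.duration_ms > MAX_DURATION_MS || metrics.error_count > 0;
}

function buildAlertMessage(workflowName, metrics) {
  return `⚠️ Workflow ${workflowName} took ${metrics.duration_ms}ms (${metrics.error_count} errors)`;
}

// Same values as the sample metrics record: 5 min 23 s, no errors.
const execution = { duration_ms: 323000, error_count: 0 };
if (shouldAlert(execution)) {
  console.log(buildAlertMessage('etl_sales_daily', execution));
}
```

With the sample record the alert fires on duration alone, since 323000 ms exceeds the 300000 ms threshold.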
Log streaming:
environment:
  - N8N_LOG_LEVEL=info
  - N8N_LOG_OUTPUT=file
  # N8N_LOG_FILE_LOCATION expects a file path, not a directory
  - N8N_LOG_FILE_LOCATION=/var/log/n8n/n8n.log
# Filebeat config
filebeat.inputs:
  - type: log
    paths:
      - /var/log/n8n/*.log
output.elasticsearch:
  hosts: ["elasticsearch:9200"]
7.4. CI/CD
Export workflows:
# Export a workflow
curl -X GET http://localhost:5678/api/v1/workflows/123 \
  -H "X-N8N-API-KEY: $API_KEY" > workflow.json
# Import a workflow
curl -X POST http://localhost:5678/api/v1/workflows \
  -H "X-N8N-API-KEY: $API_KEY" \
  -H "Content-Type: application/json" \
  -d @workflow.json
Git integration:
# Structure
repo/
├── workflows/
│ ├── etl_sales_daily.json
│ ├── sync_customers.json
│ └── reports_weekly.json
├── credentials/ (encrypted)
└── .github/workflows/
└── deploy.yml
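# GitHub Actions
name: Deploy N8N Workflows
on:
  push:
    branches: [main]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Deploy to N8N
        env:
          # Assumes N8N_URL is stored as a repository secret alongside the API key
          N8N_URL: ${{ secrets.N8N_URL }}
          N8N_API_KEY: ${{ secrets.N8N_API_KEY }}
        run: |
          for workflow in workflows/*.json; do
            curl --fail -X POST "$N8N_URL/api/v1/workflows" \
              -H "X-N8N-API-KEY: $N8N_API_KEY" \
              -H "Content-Type: application/json" \
              -d @"$workflow"
          done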
Unit 8: Security and Compliance
8.1. Security
SSL/TLS configuration:
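# Nginx reverse proxy
server {
    listen 443 ssl http2;
    server_name n8n.company.com;
    ssl_certificate /etc/ssl/certs/n8n.crt;
    ssl_certificate_key /etc/ssl/private/n8n.key;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers HIGH:!aNULL:!MD5;
    location / {
        proxy_pass http://n8n:5678;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        # WebSocket upgrade, used by the editor UI for live updates
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}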
Data encryption:
# Generate a strong encryption key
openssl rand -hex 32
# Critical variables
N8N_ENCRYPTION_KEY=a8f5f167f44f... # Credentials
N8N_JWT_SECRET=b9g6g278g55g... # Sessions
8.2. RBAC
Roles and permissions:
Owner:
✅ Manage users
✅ View/edit all workflows
✅ Manage global credentials
✅ System configuration
Editor:
✅ Create/edit own workflows
✅ Execute workflows
✅ Create own credentials
❌ Manage users
Viewer:
✅ View workflows
✅ View execution history
❌ Edit workflows
❌ Access credentials
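The role matrix above can be modelled as a deny-by-default permission table. The sketch below is only an illustration of the idea: the action identifiers (such as users:manage) and the can function are hypothetical and do not correspond to N8N's internal permission names.

```javascript
// Role -> allowed actions, mirroring the ✅ entries listed above.
// Anything not listed is denied by default.
const PERMISSIONS = new Map([
  ['owner', new Set([
    'users:manage',
    'workflows:view_all',
    'workflows:edit_all',
    'credentials:manage_global',
    'system:configure',
  ])],
  ['editor', new Set([
    'workflows:edit_own',
    'workflows:execute',
    'credentials:create_own',
  ])],
  ['viewer', new Set([
    'workflows:view',
    'executions:view',
  ])],
]);

// Returns true only when the role exists and explicitly allows the action.
function can(role, action) {
  const allowed = PERMISSIONS.get(role);
  return allowed !== undefined && allowed.has(action);
}

console.log(can('editor', 'workflows:execute')); // true
console.log(can('editor', 'users:manage'));      // false
console.log(can('viewer', 'workflows:edit_own')); // false
```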
8.3. Compliance
GDPR:
✅ Data residency: Self-host in the EU
✅ Right to deletion:
- DELETE FROM executions WHERE user_id = X
- DELETE FROM credentials WHERE user_id = X
✅ Encryption at rest
✅ Audit logs of all access
Audit trail:
CREATE TABLE audit_log (
    id SERIAL PRIMARY KEY,
    timestamp TIMESTAMP DEFAULT NOW(),
    user_id INTEGER,
    action VARCHAR(50), -- 'workflow_created', 'credential_accessed'
    resource_type VARCHAR(50),
    resource_id VARCHAR(255),
    ip_address VARCHAR(45),
    user_agent TEXT,
    details JSONB
);
-- Audit query
SELECT * FROM audit_log
WHERE action = 'credential_accessed'
  AND resource_id = 'prod_database'
ORDER BY timestamp DESC;
Unit 9: Big Data Use Cases
9.1. Real-Time Ingestion
Architecture:
External System → [Webhook] → N8N → [Validate] → [Transform] → [Load to Data Lake]
↓
[Redis: Buffer]
↓
[Batch every 60s]
↓
[S3: Parquet files]
Implementation:
[Webhook Trigger]
Path: /events/ingest
↓
[Validate Schema]
↓
[Redis: Add to Buffer]
Key: events:buffer:{{$now.toFormat('yyyyMMddHHmm')}}
Value: {{$json}}
↓
[Respond: 202 Accepted]
---
[Schedule: Every 60s]
↓
[Redis: Get Buffered Events]
↓
[IF: Count > 0]
↓
[Transform to Parquet]
↓
[S3: Upload]
Path: s3://data-lake/events/{{$now.toFormat('yyyy/MM/dd/HH')}}/batch_{{$now.toUnixInteger()}}.parquet
↓
[Redis: Clear Buffer]
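The buffer key above groups events into one-minute buckets. One way to keep the scheduled job from clearing a bucket that is still receiving events is to read the previous, already closed minute. The sketch below illustrates that idea under stated assumptions: it formats keys in UTC (the N8N expression uses the workflow time zone), and bufferKey and previousBufferKey are hypothetical helpers.

```javascript
// Left-pad a number with zeros, e.g. 9 -> "09".
function pad(n, width = 2) {
  return String(n).padStart(width, '0');
}

// Build the per-minute buffer key, e.g. events:buffer:202602091000 (UTC).
function bufferKey(date) {
  return (
    'events:buffer:' +
    date.getUTCFullYear() +
    pad(date.getUTCMonth() + 1) +
    pad(date.getUTCDate()) +
    pad(date.getUTCHours()) +
    pad(date.getUTCMinutes())
  );
}

// Key of the minute before `date`: that bucket no longer receives writes,
// so it can be uploaded and cleared without racing the webhook.
function previousBufferKey(date) {
  return bufferKey(new Date(date.getTime() - 60 * 1000));
}

const now = new Date('2026-02-09T10:00:30Z');
console.log(bufferKey(now));         // events:buffer:202602091000
console.log(previousBufferKey(now)); // events:buffer:202602090959
```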
9.2. Data Warehouse Pipeline
Complete ETL Pipeline:
[Schedule: Daily 2AM]
↓
[Extract from Sources]
├─ [Salesforce API]
├─ [PostgreSQL DB]
└─ [S3 Files]
↓
[Merge All Sources]
↓
[Transform & Clean]
├─ Deduplicate
├─ Validate
├─ Enrich
└─ Normalize
↓
[Load to Staging]
PostgreSQL: staging.raw_data
↓
[dbt: Transform in DWH]
Run models:
- staging models
- fact tables
- dimension tables
↓
[Data Quality Checks]
↓
[IF: All Pass]
├─ true → [Promote to Production]
└─ false → [Rollback + Alert]
↓
[Update Metadata]
↓
[Send Success Report]
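The [Data Quality Checks] gate in the pipeline above could look like the following sketch. It assumes the staged rows are plain objects and checks only three things (non-empty load, required fields present, unique key). The function name and the sample field names are illustrative.

```javascript
// Run basic checks over staged rows and report every failure found.
function runQualityChecks(rows, { requiredFields, uniqueKey }) {
  const failures = [];
  if (rows.length === 0) failures.push('no rows loaded');

  const seenKeys = new Set();
  rows.forEach((row, index) => {
    // Required fields must be present and non-empty.
    for (const field of requiredFields) {
      const value = row[field];
      if (value === null || value === undefined || value === '') {
        failures.push(`row ${index}: missing ${field}`);
      }
    }
    // The unique key must not repeat within the batch.
    const key = row[uniqueKey];
    if (seenKeys.has(key)) {
      failures.push(`row ${index}: duplicate ${uniqueKey}=${key}`);
    }
    seenKeys.add(key);
  });

  return { passed: failures.length === 0, failures };
}

const stagedRows = [
  { customer_id: 1, email: 'a@example.com' },
  { customer_id: 2, email: '' },
  { customer_id: 1, email: 'c@example.com' },
];
const report = runQualityChecks(stagedRows, {
  requiredFields: ['customer_id', 'email'],
  uniqueKey: 'customer_id',
});
console.log(report.passed);   // false
console.log(report.failures); // one missing email, one duplicate customer_id
```

The passed flag maps directly onto the [IF: All Pass] branch, and the failures list can be attached to the rollback alert.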
9.3. ML Operations
ML Pipeline:
[Schedule: Weekly]
↓
[Fetch Training Data]
BigQuery: last 90 days
↓
[Data Preparation]
- Feature engineering
- Train/test split
↓
[Trigger Training Job]
SageMaker/Vertex AI
↓
[Wait for Completion]
↓
[Evaluate Model]
Metrics: accuracy, F1, AUC
↓
[IF: Metrics > Threshold]
├─ true → [Deploy New Model]
│ ↓
│ [Update Model Registry]
│ ↓
│ [Switch Production Traffic]
│
└─ false → [Alert ML Team]
↓
[Keep Current Model]
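The [IF: Metrics > Threshold] decision can be sketched as a check that every tracked metric exceeds its minimum. The threshold values below are hypothetical placeholders, and the metric names follow the list in the Evaluate Model step.

```javascript
// Deploy only when every metric named in `thresholds` is present and above its minimum.
function shouldDeploy(metrics, thresholds) {
  return Object.entries(thresholds).every(
    ([name, minimum]) => typeof metrics[name] === 'number' && metrics[name] > minimum
  );
}

// Hypothetical thresholds; real values depend on the model and business needs.
const thresholds = { accuracy: 0.9, f1: 0.85, auc: 0.92 };

const candidate = { accuracy: 0.93, f1: 0.88, auc: 0.95 };
const weakCandidate = { accuracy: 0.93, f1: 0.8, auc: 0.95 };

console.log(shouldDeploy(candidate, thresholds));     // true
console.log(shouldDeploy(weakCandidate, thresholds)); // false
```

Treating a missing metric as a failure keeps an incomplete evaluation from silently promoting a model.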
9.4. Report Automation
[Schedule: Daily 8AM]
↓
[BigQuery: Run Analytics Queries]
- Daily sales by region
- Top products
- Customer growth
↓
[Generate Visualizations]
Create charts with Chart.js
↓
[Google Sheets: Update Dashboard]
Spreadsheet ID: abc123
Range: 'Daily Metrics'!A1:Z100
↓
[Generate PDF Report]
Template: company_report.html
Data: {{$json.metrics}}
↓
[Send Email to Stakeholders]
To: executives@company.com
Subject: Daily Business Report - {{$today}}
Attachments: report.pdf
↓
[Slack: Post Summary]
Channel: #daily-metrics
Message: "📊 Daily report ready!"
Unit 10: Final Project
10.1. Project Structure
Example: E-commerce Analytics Platform
Architecture:
Sources:
├─ Shopify API (orders, products, customers)
├─ Google Analytics (web traffic)
├─ Email Provider (campaign metrics)
└─ CRM Database (customer data)
↓
N8N Workflows:
├─ ETL Pipeline (extract, transform, load)
├─ Real-time Events (webhooks)
├─ ML Models (recommendations, churn prediction)
└─ Reporting (automated dashboards)
↓
Storage:
├─ PostgreSQL (staging)
├─ BigQuery (warehouse)
└─ Redis (cache)
↓
Outputs:
├─ Dashboards (Metabase/Looker)
├─ Email Reports
└─ Slack Alerts
10.2. Main Workflows
1. Daily ETL Pipeline:
[Schedule: 2AM Daily]
↓
[Shopify: Get Yesterday's Orders]
↓
[Transform Order Data]
↓
[Enrich with Customer Data]
↓
[Calculate Metrics]
- Total revenue
- AOV (Average Order Value)
- Products sold
↓
[Load to PostgreSQL Staging]
↓
[Trigger dbt Transformation]
↓
[Load to BigQuery DWH]
↓
[Data Quality Checks]
↓
[IF: Success]
└─> [Update Dashboard]
[Send Report Email]
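The [Calculate Metrics] step of the daily ETL pipeline above can be sketched as follows. The order shape (a total plus an items array with quantity) is an assumption for illustration and will differ from the actual Shopify payload.

```javascript
// Compute total revenue, AOV (Average Order Value), and products sold.
function calculateDailyMetrics(orders) {
  const totalRevenue = orders.reduce((sum, order) => sum + order.total, 0);
  const productsSold = orders.reduce(
    (sum, order) => sum + order.items.reduce((n, item) => n + item.quantity, 0),
    0
  );
  return {
    total_revenue: totalRevenue,
    // AOV = total revenue / number of orders; 0 when there are no orders.
    aov: orders.length > 0 ? totalRevenue / orders.length : 0,
    products_sold: productsSold,
  };
}

const yesterdayOrders = [
  { total: 120, items: [{ sku: 'A', quantity: 2 }] },
  { total: 80, items: [{ sku: 'B', quantity: 1 }, { sku: 'C', quantity: 3 }] },
];
console.log(calculateDailyMetrics(yesterdayOrders));
// { total_revenue: 200, aov: 100, products_sold: 6 }
```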
2. Real-time Order Webhook:
[Webhook: New Order]
↓
[Validate Order Data]
↓
[Enrich Customer Profile]
↓
[ML: Predict Churn Risk]
↓
[IF: High Value Customer]
├─> [Slack: Notify Sales Team]
└─> [CRM: Create Follow-up Task]
↓
[Store in Real-time Table]
↓
[Update Dashboard Cache]
3. Weekly Report:
[Schedule: Monday 9AM]
↓
[BigQuery: Weekly Metrics]
↓
[Generate Visualizations]
↓
[AI: Generate Insights]
Prompt: "Analyze these metrics and provide insights"
↓
[Create PDF Report]
↓
[Send to Management]
↓
[Post to Slack]
10.3. Evaluation Criteria
Project checklist:
✅ Architecture
- Complete diagram
- Justification of technical decisions
- Scalability considerations
✅ Implementation
- At least 5 interconnected workflows
- Robust error handling
- Adequate logging
- Testing with mock data
✅ ETL Pipeline
- Extraction from 3+ sources
- Complex transformations
- Load to the DWH
- Idempotency
✅ Quality
- Clean, commented code
- Descriptive names
- Credential reuse
- Best practices applied
✅ Documentation
- README with setup instructions
- User manual
- Technical manual
- Diagrams
✅ Presentation
- Video demo (5-10 min)
- Technical slides
- Explanation of challenges and solutions
Final Summary of the Complete Syllabus
We have covered:
Units 1-2: Introduction, architecture, and installation
Units 3-4: Fundamental concepts and advanced nodes
Unit 5: ETL and data pipelines
Unit 6: AI integrations
Unit 7: Scalability and production
Unit 8: Security and compliance
Unit 9: Big Data use cases
Unit 10: Integrative final project
Skills acquired:
- ✅ Create complex workflows with N8N
- ✅ Implement robust ETL pipelines
- ✅ Integrate AI into automations
- ✅ Scale N8N for production
- ✅ Secure workflows and comply with regulations
- ✅ Apply N8N to real Big Data cases
Best practices learned:
- Always handle errors
- Use batch processing for large volumes
- Implement idempotency
- Monitor and log
- Document everything
- Test with mock data before going to production
This syllabus provides a solid foundation for using N8N professionally in Big Data environments.