N8N Syllabus: Units 6-10

AI, Scalability, Security, and Use Cases


Unit 6: AI Integrations

6.1. AI Nodes in N8N

N8N has native AI support through specialized nodes.

Available AI nodes:

- OpenAI (GPT-4, GPT-3.5, DALL-E, Whisper)
- Anthropic Claude
- Google PaLM/Gemini
- Hugging Face
- Cohere
- AI Agent (LangChain)
- Text Classifier
- Sentiment Analysis

Example: OpenAI GPT-4

[Trigger: New Support Ticket]
[OpenAI Chat Model]
  Model: gpt-4
  System Message: "You are a customer support specialist..."
  User Message: {{$json.ticket_description}}
  Temperature: 0.3
  Max Tokens: 500
[Parse AI Response]
[Update Ticket with AI Suggestion]
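
The node settings above map roughly onto the body of a chat request. The following is a minimal sketch, assuming the OpenAI Chat Completions message format; `buildSupportRequest` is a hypothetical helper name, and the system message is left truncated exactly as in the example.

```javascript
// Hypothetical helper: builds a request body that mirrors the node settings above.
function buildSupportRequest(ticketDescription) {
  return {
    model: 'gpt-4',
    messages: [
      { role: 'system', content: 'You are a customer support specialist...' },
      { role: 'user', content: ticketDescription },
    ],
    temperature: 0.3, // a low temperature keeps suggestions more consistent
    max_tokens: 500,  // caps the length of the reply
  };
}

const supportRequest = buildSupportRequest('My invoice shows the wrong amount.');
console.log(JSON.stringify(supportRequest, null, 2));
```

Keeping the payload in one function makes it easy to review the prompt and limits in one place before wiring them into a node.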

6.2. AI Use Cases

1. Sentiment Analysis

Sentiment analysis pipeline:

[Get Customer Reviews]
[Split In Batches: 50]
[OpenAI Chat Model]
  Prompt: "Analyze sentiment of this review and return JSON:
          {sentiment: 'positive'|'negative'|'neutral', 
           confidence: 0-1,
           key_topics: []}"
  Review: {{$json.review_text}}
[Parse JSON Response]
[IF: Negative sentiment]
  ├─ true → [Alert Customer Success Team]
  └─ false → [Store in Analytics DB]
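
The [Parse JSON Response] and [IF] steps can be sketched in a Code node as follows. This is an illustration under assumptions: `parseSentiment` and `route` are hypothetical names, and the model reply is assumed to contain a single JSON object, possibly surrounded by extra text.

```javascript
// Hypothetical helper: extracts and validates the JSON object from a model reply.
function parseSentiment(reply) {
  // Models sometimes wrap JSON in extra text, so take the outermost {...} span.
  const start = reply.indexOf('{');
  const end = reply.lastIndexOf('}');
  if (start === -1 || end <= start) {
    throw new Error('No JSON object found in reply');
  }
  const data = JSON.parse(reply.slice(start, end + 1));

  const allowed = ['positive', 'negative', 'neutral'];
  if (!allowed.includes(data.sentiment)) {
    throw new Error(`Unexpected sentiment: ${data.sentiment}`);
  }
  if (typeof data.confidence !== 'number' || data.confidence < 0 || data.confidence > 1) {
    throw new Error('confidence must be a number between 0 and 1');
  }
  if (!Array.isArray(data.key_topics)) {
    throw new Error('key_topics must be an array');
  }
  return data;
}

// Mirrors the IF node: only negative reviews go to the alert branch.
function route(result) {
  return result.sentiment === 'negative' ? 'alert' : 'store';
}

const sampleReply =
  'Result: {"sentiment": "negative", "confidence": 0.92, "key_topics": ["shipping"]}';
console.log(route(parseSentiment(sampleReply)));
```

Validating the reply before branching means a malformed answer fails loudly instead of being stored silently as a non-negative review.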

2. Information Extraction

[Receive Invoice PDF]
[Extract Text from PDF]
[OpenAI Chat Model]
  Prompt: "Extract invoice data as JSON:
          {
            invoice_number: '',
            date: '',
            vendor: '',
            total_amount: 0,
            line_items: []
          }"
  Text: {{$json.pdf_text}}
[Validate Extracted Data]
[Save to Accounting System]
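
The [Validate Extracted Data] step might look like the sketch below. The field names come from the prompt above; `validateInvoice` and the per-item `amount` field are assumptions made for illustration, since the prompt does not define the shape of a line item.

```javascript
// Hypothetical validator for the extracted invoice object.
// Returns a list of problems; an empty list means the invoice passed.
function validateInvoice(invoice) {
  const errors = [];
  if (!invoice.invoice_number) errors.push('invoice_number is missing');
  if (Number.isNaN(Date.parse(invoice.date))) errors.push('date is not parseable');
  if (!invoice.vendor) errors.push('vendor is missing');

  const totalIsValid =
    typeof invoice.total_amount === 'number' && invoice.total_amount >= 0;
  if (!totalIsValid) errors.push('total_amount must be a non-negative number');

  if (!Array.isArray(invoice.line_items) || invoice.line_items.length === 0) {
    errors.push('line_items must be a non-empty array');
  } else if (totalIsValid) {
    // Cross-check: line items (assumed to carry an `amount`) should add up
    // to the total, within one cent.
    const sum = invoice.line_items.reduce((acc, item) => acc + item.amount, 0);
    if (Math.abs(sum - invoice.total_amount) > 0.01) {
      errors.push('line_items do not add up to total_amount');
    }
  }
  return errors;
}

const problems = validateInvoice({
  invoice_number: 'INV-1',
  date: '2026-02-09',
  vendor: 'Acme',
  total_amount: 150,
  line_items: [{ amount: 100.5 }, { amount: 49.5 }],
});
console.log(problems.length === 0 ? 'valid' : problems);
```

Returning all problems at once, rather than throwing on the first, gives the reviewer a complete picture of what the model got wrong.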

3. Text Classification

// Code Node: Batch classification
// NOTE: `openai.complete` is a placeholder for your model client, not a
// built-in of the Code node; it is assumed to resolve to a parsed JSON array.
const texts = $input.all();
const batches = [];

// Group into batches of 10
for (let i = 0; i < texts.length; i += 10) {
  batches.push(texts.slice(i, i + 10));
}

// Classify each batch
const results = [];
for (const [batchIndex, batch] of batches.entries()) {
  // Number texts globally so text_id stays unique across batches
  const offset = batchIndex * 10;
  const prompt = `Classify these texts into categories: Tech, Finance, Health, Other
  ${batch.map((t, i) => `${offset + i + 1}. ${t.json.text}`).join('\n')}
  Return JSON array: [{text_id: 1, category: 'Tech'}, ...]`;

  const response = await openai.complete(prompt);
  results.push(...response);
}

return results.map(r => ({json: r}));

4. Summary Generation

[Daily News Articles] (100 articles)
[Filter: Technology category]
[OpenAI Chat Model]
  System: "Summarize tech news in 2-3 sentences"
  Articles: {{$json.articles}}
[Combine Summaries]
[Generate Email Newsletter]
[Send to Subscribers]

6.3. AI in Data Pipelines

Intelligent enrichment:

[Raw Customer Data]
[OpenAI: Standardize Company Names]
  Prompt: "Standardize company name: {{$json.company_raw}}"
  Examples: "MSFT → Microsoft, GOOGL → Google"
[OpenAI: Infer Industry]
  Prompt: "Based on company name and description, 
          classify industry"
[OpenAI: Generate Company Summary]
[Enriched Data]

Anomaly detection:

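// Code Node: AI-powered anomaly detection
// NOTE: calculateStats and openai.complete are placeholders, not built-ins;
// they must be defined or replaced before this code will run.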
const metrics = $input.all();
const historicalData = metrics.slice(0, -1);
const currentData = metrics[metrics.length - 1];

const prompt = `
Historical metrics (mean±std):
${JSON.stringify(calculateStats(historicalData))}

Current metrics:
${JSON.stringify(currentData)}

Analyze if current metrics show anomalies. Return JSON:
{
  is_anomaly: boolean,
  anomalous_fields: [],
  severity: 'low'|'medium'|'high',
  explanation: ''
}
`;

const aiAnalysis = await openai.complete(prompt);

if (aiAnalysis.is_anomaly) {
  // Trigger alert
}
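
The snippet above relies on a `calculateStats` helper that the course does not define. One possible version is sketched below, assuming each row is a plain object of numeric fields (for n8n items, pass `item.json`). The `zScoreOutliers` function is an added, hypothetical pre-filter that flags obvious outliers before any model call is made.

```javascript
// One possible calculateStats: per-field mean and population standard deviation.
function calculateStats(rows) {
  const stats = {};
  for (const field of Object.keys(rows[0] ?? {})) {
    const values = rows.map(r => r[field]).filter(v => typeof v === 'number');
    if (values.length === 0) continue;
    const mean = values.reduce((a, b) => a + b, 0) / values.length;
    const variance =
      values.reduce((acc, v) => acc + (v - mean) ** 2, 0) / values.length;
    stats[field] = { mean, std: Math.sqrt(variance) };
  }
  return stats;
}

// Hypothetical pre-filter: fields whose current value lies more than
// `k` standard deviations away from the historical mean.
function zScoreOutliers(stats, current, k = 3) {
  return Object.keys(stats).filter(field => {
    const { mean, std } = stats[field];
    return std > 0 && Math.abs(current[field] - mean) / std > k;
  });
}

const history = [{ latency: 100 }, { latency: 110 }, { latency: 90 }];
const historyStats = calculateStats(history);
console.log(zScoreOutliers(historyStats, { latency: 200 }));
```

A statistical check like this is cheap and deterministic, so it can decide whether the more expensive AI analysis is worth running at all.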

Intelligent cleaning:

[Messy Data]
[AI Agent: Data Cleaning]
  Tools:
    - Detect and fix typos
    - Standardize formats
    - Infer missing values
    - Remove duplicates (fuzzy matching)
[Validate Cleaned Data]
[Load to Clean Database]
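
The "Remove duplicates (fuzzy matching)" tool can also be approximated without a model. The sketch below uses Levenshtein edit distance, which is a substitute technique chosen for illustration and not necessarily what an AI Agent would do; `dedupeFuzzy` and its `maxDistance` threshold are hypothetical.

```javascript
// Classic dynamic-programming Levenshtein distance between two strings.
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Keeps the first occurrence of each name; later names within
// `maxDistance` edits of a kept name are treated as duplicates.
function dedupeFuzzy(names, maxDistance = 2) {
  const normalize = s => s.trim().toLowerCase();
  const kept = [];
  for (const name of names) {
    const isDuplicate = kept.some(
      k => levenshtein(normalize(k), normalize(name)) <= maxDistance
    );
    if (!isDuplicate) kept.push(name);
  }
  return kept;
}

console.log(dedupeFuzzy(['Microsoft', 'Microsft ', 'Google', 'microsoft', 'Gogle']));
```

This compares every name against every kept name, so it suits small lists; large datasets would need blocking or indexing first.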

Unit 7: Scalability and Production

7.1. Queue Mode with Redis

Configuration:

# docker-compose.yml
services:
  redis:
    image: redis:7-alpine
    command: redis-server --appendonly yes
    volumes:
      - redis_data:/data

  n8n-main:
    image: n8nio/n8n
    environment:
      - EXECUTIONS_MODE=queue
      - QUEUE_BULL_REDIS_HOST=redis
      - QUEUE_BULL_REDIS_PORT=6379
      - QUEUE_BULL_REDIS_DB=0
    ports:
      - "5678:5678"

  n8n-worker-1:
    image: n8nio/n8n
    command: worker
    environment:
      - EXECUTIONS_MODE=queue
      - QUEUE_BULL_REDIS_HOST=redis

  n8n-worker-2:
    image: n8nio/n8n
    command: worker
    environment:
      - EXECUTIONS_MODE=queue
      - QUEUE_BULL_REDIS_HOST=redis

  n8n-worker-3:
    image: n8nio/n8n
    command: worker
    environment:
      - EXECUTIONS_MODE=queue
      - QUEUE_BULL_REDIS_HOST=redis

# Named volume used by the redis service
volumes:
  redis_data:

# In queue mode, the main instance and every worker must share the same
# N8N_ENCRYPTION_KEY and the same database.

Advantages:

- ✅ Distributed executions
- ✅ High availability
- ✅ Horizontal scaling (add workers)
- ✅ Isolation (a worker crash does not affect the others)

7.2. Workflow Optimization

Best practices:

1. Batch processing:

❌ Bad: 10,000 individual HTTP requests
✅ Good: Split In Batches: 100 → 100 requests of 100 items each
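
The same batching arithmetic can be written as a small helper for use in a Code node. This is a minimal sketch; `chunk` is a hypothetical name and not an n8n built-in.

```javascript
// Splits an array into consecutive batches of at most `size` items.
function chunk(items, size) {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError('size must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

const allItems = Array.from({ length: 10000 }, (_, i) => ({ id: i }));
const itemBatches = chunk(allItems, 100);
console.log(`${itemBatches.length} requests of ${itemBatches[0].length} items`);
```

Each batch can then be sent as one request body, provided the target API accepts bulk payloads.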

2. Caching:

[Check Redis Cache]
  ↓ miss
[Fetch from API] (slow)
[Store in Cache: TTL 3600s]
[Use cached data] (fast)
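
The flow above is the cache-aside pattern. The sketch below uses an in-memory map as a stand-in for Redis so that it runs on its own; `TtlCache` and `getWithCache` are hypothetical names, and the clock is injectable only to make expiry easy to test.

```javascript
// In-memory stand-in for Redis with per-entry expiry.
class TtlCache {
  constructor(now = () => Date.now()) {
    this.now = now;
    this.entries = new Map();
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: treat as a miss
      return undefined;
    }
    return entry.value;
  }

  set(key, value, ttlSeconds) {
    this.entries.set(key, { value, expiresAt: this.now() + ttlSeconds * 1000 });
  }
}

// Cache-aside: return the cached value on a hit, otherwise fetch and store it.
async function getWithCache(cache, key, fetchFn, ttlSeconds = 3600) {
  const cached = cache.get(key);
  if (cached !== undefined) return cached;
  const fresh = await fetchFn(key); // the slow path
  cache.set(key, fresh, ttlSeconds);
  return fresh;
}
```

The default TTL of 3600 seconds matches the diagram; a shorter TTL trades more API calls for fresher data.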

3. Parallel execution:

[Main Flow]
  ├─ [Branch 1: Process A] ─┐
  ├─ [Branch 2: Process B] ─┤
  └─ [Branch 3: Process C] ─┴─> [Merge Results]
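
In plain JavaScript, the fan-out and merge shape of this diagram corresponds to `Promise.all`. The sketch below illustrates the pattern only; it makes no claim about how N8N schedules branches internally, and `runBranches` and `delay` are hypothetical helpers.

```javascript
// Resolves with `value` after `ms` milliseconds; simulates a slow branch.
const delay = (ms, value) =>
  new Promise(resolve => setTimeout(() => resolve(value), ms));

// Starts every branch before awaiting any of them. Results keep the order
// of `branches`, and the whole call rejects if any branch fails.
async function runBranches(input, branches) {
  return Promise.all(branches.map(branch => branch(input)));
}

const demoBranches = [
  x => delay(30, `A(${x})`),
  x => delay(10, `B(${x})`),
  x => delay(20, `C(${x})`),
];

runBranches('data', demoBranches).then(merged => console.log(merged));
```

Total time is governed by the slowest branch rather than the sum of all three, which is the point of running independent work concurrently.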

4. Avoid nested loops:

❌ Bad:
[Loop 1000 users]
  └─> [Loop 100 orders per user] = 100,000 iterations

✅ Good:
[Get all orders for all users: 1 query]
  └─> [Group by user: 1 operation]
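
The "Group by user" step can be done in a single pass over the orders. A minimal sketch, assuming each order carries a `user_id` field (a hypothetical field name):

```javascript
// Groups orders by user in one pass, using a Map keyed by user_id.
function groupOrdersByUser(orders) {
  const byUser = new Map();
  for (const order of orders) {
    if (!byUser.has(order.user_id)) byUser.set(order.user_id, []);
    byUser.get(order.user_id).push(order);
  }
  return byUser;
}

const sampleOrders = [
  { id: 1, user_id: 'u1', total: 20 },
  { id: 2, user_id: 'u2', total: 35 },
  { id: 3, user_id: 'u1', total: 15 },
];
const grouped = groupOrdersByUser(sampleOrders);
console.log(grouped.get('u1').length);
```

The work is proportional to the number of orders, instead of users multiplied by orders per user.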

7.3. Monitoring

Key metrics:

// Workflow execution metrics
{
  workflow_id: "abc123",
  execution_id: "exec_456",
  start_time: "2026-02-09T10:00:00Z",
  end_time: "2026-02-09T10:05:23Z",
  duration_ms: 323000,
  status: "success", // or "error"
  nodes_executed: 12,
  items_processed: 5000,
  error_count: 0,
  retry_count: 1
}

Alerts:

[Workflow Complete]
[IF: Duration > 5min OR Errors > 0]
  ↓ true
[Slack: Alert DevOps]
  Message: "⚠️ Workflow {{$workflow.name}} took {{$json.duration_ms}}ms"
[PagerDuty: Create Incident]
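
The IF condition in this alert flow can be expressed as a small function over the metrics object shown earlier. A minimal sketch; `durationMs` and `shouldAlert` are hypothetical names.

```javascript
const FIVE_MINUTES_MS = 5 * 60 * 1000;

// Derives the duration from the ISO timestamps instead of trusting a stored value.
function durationMs(metrics) {
  return Date.parse(metrics.end_time) - Date.parse(metrics.start_time);
}

// Mirrors "Duration > 5min OR Errors > 0".
function shouldAlert(metrics) {
  return durationMs(metrics) > FIVE_MINUTES_MS || metrics.error_count > 0;
}

const sampleMetrics = {
  start_time: '2026-02-09T10:00:00Z',
  end_time: '2026-02-09T10:05:23Z',
  error_count: 0,
};
console.log(durationMs(sampleMetrics), shouldAlert(sampleMetrics));
```

With the sample timestamps the run lasts 5 minutes and 23 seconds, so it crosses the five-minute threshold even though it reported no errors.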

Log streaming:

environment:
  - N8N_LOG_LEVEL=info
  - N8N_LOG_OUTPUT=file
  - N8N_LOG_FILE_LOCATION=/var/log/n8n/n8n.log

# Filebeat config
filebeat.inputs:
- type: log
  paths:
    - /var/log/n8n/*.log

output.elasticsearch:
  hosts: ["elasticsearch:9200"]

7.4. CI/CD

Export workflows:

# Export a workflow
curl -X GET http://localhost:5678/api/v1/workflows/123 \
  -H "X-N8N-API-KEY: $API_KEY" > workflow.json

# Import a workflow
curl -X POST http://localhost:5678/api/v1/workflows \
  -H "X-N8N-API-KEY: $API_KEY" \
  -H "Content-Type: application/json" \
  -d @workflow.json

Git integration:

# Structure
repo/
├── workflows/
│   ├── etl_sales_daily.json
│   ├── sync_customers.json
│   └── reports_weekly.json
├── credentials/ (encrypted)
└── .github/workflows/
    └── deploy.yml

# GitHub Actions
name: Deploy N8N Workflows
on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
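      - uses: actions/checkout@v4
      - name: Deploy to N8N
        # N8N_URL must be provided to the job, for example as a repository variable
        run: |
          for workflow in workflows/*.json; do
            curl -X POST "$N8N_URL/api/v1/workflows" \
              -H "X-N8N-API-KEY: ${{ secrets.N8N_API_KEY }}" \
              -H "Content-Type: application/json" \
              -d @"$workflow"
          done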
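
A deploy loop that only POSTs creates a new copy of every workflow on each push. One way around this is to compare local files with what already exists and update instead of create. The sketch below is hypothetical: it assumes workflow names are unique and that the instance's API offers an update endpoint of the form PUT /api/v1/workflows/{id}, which should be checked against the API reference for your version.

```javascript
// Decides, per local workflow, whether to create it or update an existing one.
// `remoteWorkflows` is assumed to be the list already present on the server,
// each with at least `id` and `name`.
function planDeployment(localWorkflows, remoteWorkflows) {
  const remoteByName = new Map(remoteWorkflows.map(w => [w.name, w]));
  return localWorkflows.map(local => {
    const remote = remoteByName.get(local.name);
    return remote
      ? { method: 'PUT', path: `/api/v1/workflows/${remote.id}`, name: local.name }
      : { method: 'POST', path: '/api/v1/workflows', name: local.name };
  });
}

const deploymentPlan = planDeployment(
  [{ name: 'etl_sales_daily' }, { name: 'sync_customers' }],
  [{ id: '123', name: 'etl_sales_daily' }]
);
console.log(deploymentPlan);
```

Separating the plan from the HTTP calls keeps the decision logic testable without a running instance.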

Unit 8: Security and Compliance

8.1. Security

SSL/TLS configuration:

# Nginx reverse proxy
server {
    listen 443 ssl http2;
    server_name n8n.company.com;

    ssl_certificate /etc/ssl/certs/n8n.crt;
    ssl_certificate_key /etc/ssl/private/n8n.key;

    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers HIGH:!aNULL:!MD5;

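    location / {
        proxy_pass http://n8n:5678;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # Forward WebSocket upgrades, which the editor UI may use for live updates
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }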
}

Data encryption:

# Generate a strong encryption key
openssl rand -hex 32

# Critical variables
N8N_ENCRYPTION_KEY=a8f5f167f44f...  # Credentials
N8N_JWT_SECRET=b9g6g278g55g...      # Sessions
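
The same kind of key can be generated from Node.js when `openssl` is not available. A minimal sketch using the built-in `crypto` module; `generateKey` is a hypothetical helper name.

```javascript
const crypto = require('node:crypto');

// Returns `bytes` cryptographically secure random bytes as a hex string,
// comparable to `openssl rand -hex 32` for the default of 32 bytes.
function generateKey(bytes = 32) {
  return crypto.randomBytes(bytes).toString('hex');
}

console.log(generateKey());
```

The output is 64 hexadecimal characters for 32 bytes. Store it in a secret manager or environment file, never in the repository.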

8.2. RBAC

Roles and permissions:

Owner:
  ✅ Manage users
  ✅ View/edit all workflows
  ✅ Manage global credentials
  ✅ System configuration

Editor:
  ✅ Create/edit own workflows
  ✅ Execute workflows
  ✅ Create own credentials
  ❌ Manage users

Viewer:
  ✅ View workflows
  ✅ View execution history
  ❌ Edit workflows
  ❌ Access credentials
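
The role matrix above can be modeled as a lookup table with a deny-by-default check. This sketch is illustrative only: the action names are hypothetical labels for the rows in the list, it encodes only the permissions the list grants explicitly, and it is not N8N's internal permission model.

```javascript
// Only permissions explicitly granted above are listed; everything else is denied.
const PERMISSIONS = {
  owner: new Set([
    'manage_users',
    'view_all_workflows',
    'edit_all_workflows',
    'manage_global_credentials',
    'system_settings',
  ]),
  editor: new Set(['edit_own_workflows', 'execute_workflows', 'create_own_credentials']),
  viewer: new Set(['view_workflows', 'view_executions']),
};

// Deny by default: unknown roles and unlisted actions return false.
function can(role, action) {
  return PERMISSIONS[role]?.has(action) ?? false;
}

console.log(can('editor', 'manage_users'), can('viewer', 'view_executions'));
```

Denying by default means that a newly added action stays unavailable until someone grants it on purpose.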

8.3. Compliance

GDPR:

✅ Data residency: self-host in the EU
✅ Right to deletion:
   - DELETE FROM executions WHERE user_id = X
   - DELETE FROM credentials WHERE user_id = X
✅ Encryption at rest
✅ Audit logs of all access

Audit trail:

CREATE TABLE audit_log (
    id SERIAL PRIMARY KEY,
    timestamp TIMESTAMP DEFAULT NOW(),
    user_id INTEGER,
    action VARCHAR(50),  -- 'workflow_created', 'credential_accessed'
    resource_type VARCHAR(50),
    resource_id VARCHAR(255),
    ip_address VARCHAR(45),
    user_agent TEXT,
    details JSONB
);

-- Audit query
SELECT * FROM audit_log 
WHERE action = 'credential_accessed'
  AND resource_id = 'prod_database'
ORDER BY timestamp DESC;

Unit 9: Big Data Use Cases

9.1. Real-Time Ingestion

Architecture:

External System → [Webhook] → N8N → [Validate] → [Transform] → [Load to Data Lake]
                                ↓
                         [Redis: Buffer]
                                ↓
                        [Batch every 60s]
                                ↓
                       [S3: Parquet files]

Implementation:

[Webhook Trigger]
  Path: /events/ingest
[Validate Schema]
[Redis: Add to Buffer]
  Key: events:buffer:{{$now.toFormat('yyyyMMddHHmm')}}
  Value: {{$json}}
[Respond: 202 Accepted]

---

[Schedule: Every 60s]
[Redis: Get Buffered Events]
[IF: Count > 0]
[Transform to Parquet]
[S3: Upload]
  Path: s3://data-lake/events/{{$now.toFormat('yyyy/MM/dd/HH')}}/batch_{{$now.toUnixInteger()}}.parquet
[Redis: Clear Buffer]
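
The webhook flow writes to a key that changes every minute, so the scheduled flow has to know which key to read. The sketch below rebuilds the key and path formats from the templates above in plain JavaScript. It assumes UTC timestamps, and `closedBufferKey` (read the previous, already complete minute) is one hypothetical strategy for the scheduled flow, not something the course specifies.

```javascript
const pad = n => String(n).padStart(2, '0');

// Splits a Date into zero-padded UTC parts.
function utcParts(date) {
  return {
    yyyy: String(date.getUTCFullYear()),
    MM: pad(date.getUTCMonth() + 1),
    dd: pad(date.getUTCDate()),
    HH: pad(date.getUTCHours()),
    mm: pad(date.getUTCMinutes()),
  };
}

// Same shape as events:buffer:{{$now.toFormat('yyyyMMddHHmm')}}.
function bufferKey(date) {
  const { yyyy, MM, dd, HH, mm } = utcParts(date);
  return `events:buffer:${yyyy}${MM}${dd}${HH}${mm}`;
}

// Key of the previous minute, which no webhook call is still writing to.
function closedBufferKey(now) {
  return bufferKey(new Date(now.getTime() - 60 * 1000));
}

// Same shape as the S3 path template, with a Unix-seconds batch suffix.
function s3Path(date) {
  const { yyyy, MM, dd, HH } = utcParts(date);
  const unixSeconds = Math.floor(date.getTime() / 1000);
  return `s3://data-lake/events/${yyyy}/${MM}/${dd}/${HH}/batch_${unixSeconds}.parquet`;
}

const sampleTime = new Date('2026-02-09T10:05:23Z');
console.log(bufferKey(sampleTime), closedBufferKey(sampleTime), s3Path(sampleTime));
```

Flushing only closed minutes avoids clearing a buffer while new events are still being appended to it.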

9.2. Data Warehouse Pipeline

Complete ETL Pipeline:

[Schedule: Daily 2AM]
[Extract from Sources]
  ├─ [Salesforce API]
  ├─ [PostgreSQL DB]
  └─ [S3 Files]
[Merge All Sources]
[Transform & Clean]
  ├─ Deduplicate
  ├─ Validate
  ├─ Enrich
  └─ Normalize
[Load to Staging]
  PostgreSQL: staging.raw_data
[dbt: Transform in DWH]
  Run models:
    - staging models
    - fact tables
    - dimension tables
[Data Quality Checks]
[IF: All Pass]
  ├─ true → [Promote to Production]
  └─ false → [Rollback + Alert]
[Update Metadata]
[Send Success Report]
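
The [Data Quality Checks] step and the promote-or-rollback decision can be sketched as follows. The three checks (row count, required fields, unique key) and all option names are assumptions chosen for illustration; a real pipeline would use checks that match its own data.

```javascript
// Runs basic checks on staged rows and reports what to do next.
function runQualityChecks(rows, { requiredFields, keyField, minRows }) {
  const failures = [];

  if (rows.length < minRows) {
    failures.push(`expected at least ${minRows} rows, got ${rows.length}`);
  }

  for (const field of requiredFields) {
    const missing = rows.filter(r => r[field] === null || r[field] === undefined).length;
    if (missing > 0) failures.push(`${missing} rows missing ${field}`);
  }

  const keys = rows.map(r => r[keyField]);
  const duplicates = keys.length - new Set(keys).size;
  if (duplicates > 0) failures.push(`${duplicates} duplicate values in ${keyField}`);

  const passed = failures.length === 0;
  return { passed, failures, next: passed ? 'promote' : 'rollback_and_alert' };
}

const checkResult = runQualityChecks(
  [
    { id: 1, email: 'a@example.com' },
    { id: 2, email: null },
    { id: 2, email: 'c@example.com' },
  ],
  { requiredFields: ['email'], keyField: 'id', minRows: 1 }
);
console.log(checkResult);
```

Collecting every failure in one report makes the alert actionable, since the team sees all problems from a single run.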

9.3. ML Operations

ML Pipeline:

[Schedule: Weekly]
[Fetch Training Data]
  BigQuery: last 90 days
[Data Preparation]
  - Feature engineering
  - Train/test split
[Trigger Training Job]
  SageMaker/Vertex AI
[Wait for Completion]
[Evaluate Model]
  Metrics: accuracy, F1, AUC
[IF: Metrics > Threshold]
  ├─ true → [Deploy New Model]
  │          ↓
  │        [Update Model Registry]
  │          ↓
  │        [Switch Production Traffic]
  └─ false → [Alert ML Team]
            [Keep Current Model]
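
The [IF: Metrics > Threshold] gate can be written as a function that requires every tracked metric to clear its bar. A minimal sketch; `shouldDeploy` is a hypothetical name and the threshold values are made up for the example.

```javascript
// Deploy only if every metric named in `thresholds` is present
// and strictly greater than its minimum.
function shouldDeploy(metrics, thresholds) {
  return Object.entries(thresholds).every(
    ([name, min]) => typeof metrics[name] === 'number' && metrics[name] > min
  );
}

// Illustrative thresholds, not recommendations.
const deployThresholds = { accuracy: 0.9, f1: 0.85, auc: 0.92 };

console.log(shouldDeploy({ accuracy: 0.93, f1: 0.88, auc: 0.95 }, deployThresholds));
console.log(shouldDeploy({ accuracy: 0.93, f1: 0.8, auc: 0.95 }, deployThresholds));
```

A missing metric counts as a failure, so a broken evaluation job cannot accidentally promote a model.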

9.4. Report Automation

[Schedule: Daily 8AM]
[BigQuery: Run Analytics Queries]
  - Daily sales by region
  - Top products
  - Customer growth
[Generate Visualizations]
  Create charts with Chart.js
[Google Sheets: Update Dashboard]
  Spreadsheet ID: abc123
  Range: 'Daily Metrics'!A1:Z100
[Generate PDF Report]
  Template: company_report.html
  Data: {{$json.metrics}}
[Send Email to Stakeholders]
  To: executives@company.com
  Subject: Daily Business Report - {{$today}}
  Attachments: report.pdf
[Slack: Post Summary]
  Channel: #daily-metrics
  Message: "📊 Daily report ready!"

Unit 10: Final Project

10.1. Project Structure

Example: E-commerce Analytics Platform

Architecture:

Sources:
  ├─ Shopify API (orders, products, customers)
  ├─ Google Analytics (web traffic)
  ├─ Email Provider (campaign metrics)
  └─ CRM Database (customer data)
N8N Workflows:
  ├─ ETL Pipeline (extract, transform, load)
  ├─ Real-time Events (webhooks)
  ├─ ML Models (recommendations, churn prediction)
  └─ Reporting (automated dashboards)
Storage:
  ├─ PostgreSQL (staging)
  ├─ BigQuery (warehouse)
  └─ Redis (cache)
Outputs:
  ├─ Dashboards (Metabase/Looker)
  ├─ Email Reports
  └─ Slack Alerts

10.2. Main Workflows

1. Daily ETL Pipeline:

[Schedule: 2AM Daily]
[Shopify: Get Yesterday's Orders]
[Transform Order Data]
[Enrich with Customer Data]
[Calculate Metrics]
  - Total revenue
  - AOV (Average Order Value)
  - Products sold
[Load to PostgreSQL Staging]
[Trigger dbt Transformation]
[Load to BigQuery DWH]
[Data Quality Checks]
[IF: Success]
  └─> [Update Dashboard]
      [Send Report Email]
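
The [Calculate Metrics] step of this pipeline could be implemented as below. The order shape (`total`, `items[].quantity`) is an assumption made for the example and does not reflect the actual Shopify payload.

```javascript
// Computes total revenue, average order value (AOV), and products sold.
function calculateMetrics(orders) {
  const totalRevenue = orders.reduce((sum, order) => sum + order.total, 0);
  const productsSold = orders.reduce(
    (sum, order) => sum + order.items.reduce((n, item) => n + item.quantity, 0),
    0
  );
  // AOV is revenue divided by order count; guard against an empty day.
  const aov = orders.length > 0 ? totalRevenue / orders.length : 0;
  return { totalRevenue, aov, productsSold, orderCount: orders.length };
}

const dailyMetrics = calculateMetrics([
  { total: 100, items: [{ quantity: 2 }, { quantity: 1 }] },
  { total: 50, items: [{ quantity: 3 }] },
]);
console.log(dailyMetrics);
```

For real currency amounts, working in integer cents avoids floating-point rounding in the sums.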

2. Real-time Order Webhook:

[Webhook: New Order]
[Validate Order Data]
[Enrich Customer Profile]
[ML: Predict Churn Risk]
[IF: High Value Customer]
  ├─> [Slack: Notify Sales Team]
  └─> [CRM: Create Follow-up Task]
[Store in Real-time Table]
[Update Dashboard Cache]

3. Weekly Report:

[Schedule: Monday 9AM]
[BigQuery: Weekly Metrics]
[Generate Visualizations]
[AI: Generate Insights]
  Prompt: "Analyze these metrics and provide insights"
[Create PDF Report]
[Send to Management]
[Post to Slack]

10.3. Evaluation Criteria

Project checklist:

✅ Architecture
  - Complete diagram
  - Justification of technical decisions
  - Scalability considerations

✅ Implementation
  - At least 5 interconnected workflows
  - Robust error handling
  - Adequate logging
  - Testing with mock data

✅ ETL Pipeline
  - Extract from 3+ sources
  - Complex transformations
  - Load to DWH
  - Idempotency

✅ Quality
  - Clean, commented code
  - Descriptive names
  - Credential reuse
  - Best practices applied

✅ Documentation
  - README with setup instructions
  - User manual
  - Technical manual
  - Diagrams

✅ Presentation
  - Video demo (5-10 min)
  - Technical slides
  - Explanation of challenges and solutions
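
Idempotency, listed under the ETL criteria, means that running the same load twice leaves the target in the same state as running it once. A minimal sketch of the idea, using a Map keyed by order id as a stand-in for a warehouse table; `upsertOrders` is a hypothetical name.

```javascript
// Upsert by primary key: inserting an existing id overwrites it
// instead of adding a second row.
function upsertOrders(table, orders) {
  for (const order of orders) {
    table.set(order.id, order);
  }
  return table;
}

const ordersTable = new Map();
const ordersBatch = [
  { id: 'o1', total: 100 },
  { id: 'o2', total: 50 },
];

upsertOrders(ordersTable, ordersBatch);
upsertOrders(ordersTable, ordersBatch); // a retry of the same batch
console.log(ordersTable.size);
```

In a SQL warehouse the same effect is usually achieved with a MERGE or an INSERT ... ON CONFLICT statement keyed on the business identifier.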

Final Summary of the Complete Syllabus

We have covered:

- Units 1-2: Introduction, architecture, and installation
- Units 3-4: Core concepts and advanced nodes
- Unit 5: ETL and data pipelines
- Unit 6: AI integrations
- Unit 7: Scalability and production
- Unit 8: Security and compliance
- Unit 9: Big Data use cases
- Unit 10: Integrative final project

Skills acquired:

- ✅ Build complex workflows with N8N
- ✅ Implement robust ETL pipelines
- ✅ Integrate AI into automations
- ✅ Scale N8N for production
- ✅ Secure workflows and comply with regulations
- ✅ Apply N8N to real Big Data cases

Best practices learned:

- Always handle errors
- Use batch processing for large volumes
- Implement idempotency
- Monitor and log
- Document everything
- Test with mock data before going to production

This syllabus provides a solid foundation for using N8N professionally in Big Data environments.