Portal - Outgoing HTTP Error Rate Alerts
This guide helps the support team diagnose and resolve outgoing HTTP request failures detected by the Outgoing HTTP Error Rate Alert in Grafana for the portal service.
Root Cause
Portal Outgoing HTTP error rate alerts occur when the portal service experiences failures in making HTTP requests to services or APIs. Common causes include:
- Service Unavailability: Target services (Ranger API, Scheme Server, Ops-Server, Dataserver) are down or unresponsive
- Network/Proxy Issues: Connectivity problems or proxy configuration errors preventing outbound requests
- Resource Constraints: Memory pressure or connection pool exhaustion affecting outbound calls
- Service Dependency Failures: Services returning errors or not responding as expected
- Timeout Issues: Services taking too long to respond, causing request timeouts
Solution
Step 1: Identify Failing Endpoint
Use the alert metadata in Grafana (such as `URI`, `Method`, and `Status`) to identify:
- Which portal endpoint is failing (Example: Ranger API (`/ranger/*`), Scheme Server (`/peg/*`), Ops-Server (`/ops-server/*`))
- What request caused the issue (Example: `GET /ranger/service/xusers/users` → 500 Internal Server Error)
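If the alert metadata alone is not enough, the same breakdown can be pulled from Prometheus directly. The sketch below is illustrative only: the Prometheus URL, the `http_client_requests_seconds_count` metric name, and the `service`/`uri`/`method`/`status` labels are assumptions that should be replaced with whatever the Grafana panel actually queries.

```python
# Illustrative sketch (not the exact panel query): list outgoing requests from
# the portal that returned 5xx in the last 5 minutes, grouped by URI, method,
# and status. Metric and label names are assumptions -- adjust to match the
# dashboard's underlying query.
import requests

PROMETHEUS_URL = "http://prometheus:9090"  # placeholder

query = (
    "sum by (uri, method, status) ("
    'increase(http_client_requests_seconds_count{service="portal", status=~"5.."}[5m])'
    ")"
)

resp = requests.get(f"{PROMETHEUS_URL}/api/v1/query", params={"query": query}, timeout=10)
resp.raise_for_status()

for series in resp.json()["data"]["result"]:
    labels = series["metric"]
    count = float(series["value"][1])
    if count > 0:
        print(f'{labels.get("method")} {labels.get("uri")} -> {labels.get("status")}: {count:.0f} errors')
```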
Step 2: Grafana Dashboard Checks
- Outgoing HTTP Request Rate
This panel shows how many requests per second Portal is making to services over the last 5 minutes.
What It Shows:
- Request volume to each service (Ranger, PEG, Ops-Server)
- Breakdown by HTTP status code (200, 404, 500, etc.)
- Breakdown by HTTP method (GET, POST, PUT, etc.)
- Specific endpoints being called
When to Check:
- To see if Portal is making unusually high or low requests to services
- To identify which specific endpoints are generating the most traffic
- To correlate request spikes with error increases
- Outgoing HTTP Response Time
This panel shows how long services take to respond to Portal's requests.
What It Shows:
- Average response time for each service
- Response time trends over time
- Breakdown by endpoint and status code
- Identifies slow-performing services
When to Check:
- If Portal UI feels slow or unresponsive
- To identify which service is causing performance issues
- To see if response times correlate with error rates
- Outgoing Connection Status
This panel displays the current proxy connection health between Portal and services.
What It Shows:
- Ranger Proxy: Portal → Ranger API communication status (Connected/Disconnected/NA)
- PEG Proxy: Portal → PEG/Scheme Server communication status (Connected/Disconnected/NA)
- Ops Server Proxy: Portal → Ops-Server communication status (Connected/Disconnected/NA)
Status Indicators:
- Connected (Green): Portal can successfully communicate with the service
- Disconnected (Red): Portal cannot reach the service (network/auth/service down)
- NA (Gray): Service not configured or monitoring not available
When to Check:
- If you see 404/503/504 errors in outgoing requests
- When Portal features dependent on services aren't working
- To verify which specific service is causing connectivity issues
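When the panel is unavailable, a rough equivalent of the Connected/Disconnected check is to probe each proxied route directly. The sketch below assumes the portal exposes the three services under `/ranger/*`, `/peg/*`, and `/ops-server/*` behind a placeholder base URL; the exact paths, ports, and any required authentication are deployment-specific.

```python
# Illustrative sketch: approximate the Outgoing Connection Status panel by
# probing each proxied route through the portal. Base URL and paths are
# placeholders; authentication (if required) is omitted.
import requests

PORTAL_BASE_URL = "http://portal:8080"  # placeholder

PROXY_ROUTES = {
    "Ranger Proxy": "/ranger/",          # Portal -> Ranger API
    "PEG Proxy": "/peg/",                # Portal -> PEG / Scheme Server
    "Ops Server Proxy": "/ops-server/",  # Portal -> Ops-Server
}

for name, path in PROXY_ROUTES.items():
    try:
        resp = requests.get(PORTAL_BASE_URL + path, timeout=5)
        # Any response below 500 means the route is reachable; 502/503/504
        # usually indicates the backend behind the proxy is down.
        state = "Connected" if resp.status_code < 500 else f"Disconnected ({resp.status_code})"
    except requests.RequestException as exc:
        state = f"Disconnected ({type(exc).__name__})"
    print(f"{name}: {state}")
```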
Step 3: Apply Quick Fixes Based on Common Error Patterns
| Error Code | Likely Cause | Quick Fix |
|---|---|---|
| 400 | Invalid request parameters / malformed JSON | Validate input parameters and request format |
| 401/419 | Token expired | Verify service account tokens and check service health |
| 404 | Endpoint not found / resource missing | Verify URL configuration and resource existence |
| 500 | Internal service error | Proceed to the Escalation Checklist for further investigation |
| 503/504 | Service unavailable | Check service health |
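Before applying a fix, it can help to replay the failing request from Step 1 and confirm which row of the table applies. A minimal sketch, assuming the failing call was `GET /ranger/service/xusers/users` through a placeholder portal base URL; authentication headers, if required, are omitted.

```python
# Illustrative sketch: replay the failing request identified in Step 1 and map
# the outcome onto the quick-fix table above. URL and path are placeholders.
import requests

PORTAL_BASE_URL = "http://portal:8080"         # placeholder
FAILING_PATH = "/ranger/service/xusers/users"  # from the alert metadata

try:
    resp = requests.get(PORTAL_BASE_URL + FAILING_PATH, timeout=10)
except requests.Timeout:
    print("Timed out -> treat like 503/504: check the target service's health")
except requests.ConnectionError:
    print("Connection failed -> check network/proxy configuration")
else:
    code = resp.status_code
    if code == 400:
        print("400 -> validate input parameters and request format")
    elif code in (401, 419):
        print("401/419 -> verify service account tokens")
    elif code == 404:
        print("404 -> verify URL configuration and resource existence")
    elif code == 500:
        print("500 -> escalate per the Escalation Checklist")
    elif code in (503, 504):
        print("503/504 -> check target service health")
    else:
        print(f"Received {code}; the request may have succeeded or hit an unlisted status")
```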
Escalation Checklist
If the issue cannot be resolved through the specific troubleshooting guides, escalate it to the appropriate team with the following details:
- Timestamp of the error: Include the exact time the alert was triggered
- Grafana dashboard and alert screenshots:
  - Grafana → Dashboards → Portal folder → Portal Dashboard
  - Grafana → Alerting → Alert rules → Outgoing HTTP Error Rate Alerts
- Portal Service Logs: Include portal service logs, any relevant client-side actions, or test steps that reproduce the issue
Option 1: Download Logs from Diagnostic Portal (Recommended)
1. Open Diagnostic Portal and go to Dashboard → Services Tab
2. Type "portal" in the service column search box
3. Click on the portal service to open its details page
4. Find and click on a pod that shows "active" status
5. Click the "Logs" tab on the pod details page
6. Click the "Download Logs" button to save the logs
7. If you see multiple portal pods with "active" status, repeat steps 4-6 for each one
Option 2: Manual Log Collection (If Diagnostic service is not enabled)
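The exact manual steps depend on the environment; one possible sketch is shown below, assuming the portal runs on Kubernetes, the official `kubernetes` Python client is installed, and the pods carry a hypothetical `app=portal` label in a placeholder namespace.

```python
# Rough sketch only: collect logs from all running portal pods via the
# Kubernetes API. The namespace and label selector are assumptions -- adjust
# them to your deployment (or use the equivalent kubectl commands instead).
from kubernetes import client, config

NAMESPACE = "default"          # placeholder
LABEL_SELECTOR = "app=portal"  # placeholder

config.load_kube_config()  # use config.load_incluster_config() inside the cluster
v1 = client.CoreV1Api()

pods = v1.list_namespaced_pod(namespace=NAMESPACE, label_selector=LABEL_SELECTOR)
for pod in pods.items:
    if pod.status.phase != "Running":
        continue
    name = pod.metadata.name
    log_text = v1.read_namespaced_pod_log(name=name, namespace=NAMESPACE)
    with open(f"{name}.log", "w") as fh:
        fh.write(log_text)
    print(f"Saved logs for {name}")
```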
- Current portal configuration details: Configuration settings and deployment information
- Relevant user actions: Actions leading up to the error
For additional assistance, see How to Contact Support for detailed guidance on reaching out to the support team.
- Back to: Troubleshooting Overview