# Portal - Incoming HTTP Error Rate Alerts
This guide helps the support team diagnose and resolve incoming HTTP request failures that trigger the "Incoming HTTP Error Rate" alert in Grafana for the Portal service.
## Root Cause
"Portal Incoming HTTP Error Rate" alerts are triggered when the Portal service encounters failures while processing HTTP requests from users or external systems. Common causes include:
- Database Connection Issues: Exhausted connection pool or unavailable database service
- Authentication Problems: Expired user sessions or invalid authentication tokens
- Permission Errors: Insufficient user permissions to access specific resources
- Resource Availability: Missing or misconfigured endpoints or resources
- Application Logic Errors: Internal exceptions such as null pointer errors
- Invalid Request Parameters: Malformed JSON or improperly structured request parameters
- Service Overload: High traffic or limited system resources impacting performance
## Solution
Step 1: Identify the Failing Endpoint

Use the alert metadata in Grafana (such as `URI`, `Method`, `Status`) to identify:

- Which portal endpoint is failing (Example: `/api/users`, `/api/data/dic/systems`, `/api/users/token`)
- What request caused the issue (Example: `GET /api/users` → `500 Internal Server Error`)
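If the alert is delivered as a webhook, the same metadata can be extracted programmatically. The sketch below assumes a payload whose alerts carry `uri`, `method`, and `status` labels; these label names are illustrative assumptions, not a confirmed Grafana schema.

```python
import json

# Hypothetical Grafana alert webhook payload; the label names (uri, method,
# status) are assumptions for illustration, not a confirmed schema.
payload = json.loads("""
{
  "alerts": [
    {
      "labels": {
        "alertname": "Incoming HTTP Error Rate",
        "uri": "/api/users",
        "method": "GET",
        "status": "500"
      }
    }
  ]
}
""")

for alert in payload["alerts"]:
    labels = alert["labels"]
    # Summarize which request is failing, e.g. "GET /api/users -> 500"
    print(f"{labels['method']} {labels['uri']} -> {labels['status']}")
```

With the payload above, this prints `GET /api/users -> 500`, which matches the example in Step 1.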
Step 2: Apply Quick Fixes Based on Error Pattern
| Error Pattern | Likely Cause | Quick Fix |
|---|---|---|
| 400 + `ValidationException` | Invalid request parameters/malformed JSON | Validate input parameters and request format |
| 401 + Session Expired | User session timeout | Check session configuration or ask the user to re-login |
| 403 + `AccessDeniedException` | User lacks required permissions | Check the user's role and permissions: 1. Log in to the portal and go to Settings → User Management. 2. Check the user's role and associated permissions. 3. Assign an appropriate role using Portal User Roles. |
| 404 + `ResourceNotFoundException` | Resource not found/endpoint missing | Verify URL configuration and resource existence |
| 500 + Internal Server Error | Application internal error or unexpected condition | Proceed to the Escalation Checklist for further investigation |
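The triage table above can be sketched as a small lookup helper. The function name and the exact exception strings are illustrative; they mirror the table rather than any real Portal API.

```python
# Maps (HTTP status, error pattern) pairs from the table above to a
# suggested quick fix. An illustrative sketch, not part of the Portal service.
QUICK_FIXES = {
    (400, "ValidationException"): "Validate input parameters and request format",
    (401, "Session Expired"): "Check session configuration or ask the user to re-login",
    (403, "AccessDeniedException"): "Check the user's role and permissions",
    (404, "ResourceNotFoundException"): "Verify URL configuration and resource existence",
    (500, "Internal Server Error"): "Proceed to the Escalation Checklist",
}

def suggest_fix(status: int, error: str) -> str:
    """Return the quick fix for a known (status, error) pattern, or escalate."""
    return QUICK_FIXES.get((status, error), "Unknown pattern: escalate with full details")

print(suggest_fix(403, "AccessDeniedException"))
```

Unrecognized patterns fall through to escalation, matching the guidance for 500-class errors.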
## Escalation Checklist
If the issue cannot be resolved through the specific troubleshooting guides, escalate it to the appropriate team with the following details:
- Timestamp of the error: Include the exact time the alert was triggered
- Grafana dashboard and alert screenshots:
  - Grafana → Dashboards → Portal folder → Portal Dashboard
  - Grafana → Alerting → Alert rules → Incoming HTTP Error Rate Alerts
- Portal Service Logs: Include any logs from the Portal, client-side actions, or test steps that reproduce the issue
Option 1: Download Logs from the Diagnostic Portal (Recommended)
1. Open the Diagnostic Portal and go to Dashboard → Services Tab
2. Type "portal" in the service column search box
3. Click on the portal service to open its details page
4. Find and click on a pod that shows "active" status
5. Click the "Logs" tab on the pod details page
6. Click the "Download Logs" button to save the logs
7. If multiple portal pods show "active" status, repeat steps 4-6 for each one
Option 2: Manual Log Collection (If Diagnostic service is not enabled)
- Current portal configuration details: Configuration settings and deployment information
- Relevant user actions: Actions leading up to the error
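When escalating, the checklist items above can be collected into a single report before handing off. This is a minimal sketch; the function name and all field values are hypothetical placeholders.

```python
from datetime import datetime, timezone

# Illustrative escalation report builder; every value below is a placeholder.
def build_escalation_report(timestamp, endpoint, status, log_file, user_actions):
    lines = [
        f"Timestamp of the error: {timestamp}",
        f"Failing endpoint: {endpoint} ({status})",
        f"Portal service logs: {log_file}",
        f"Relevant user actions: {user_actions}",
    ]
    return "\n".join(lines)

report = build_escalation_report(
    timestamp=datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc).isoformat(),
    endpoint="GET /api/users",
    status="500 Internal Server Error",
    log_file="portal-pod-logs.txt",
    user_actions="User attempted to open the user list",
)
print(report)
```

Attach the Grafana screenshots and downloaded pod logs alongside the report when escalating.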
For additional assistance, see How to Contact Support for detailed guidance on reaching out to the support team.
- Back to: Troubleshooting Overview