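In operations work, core API endpoints need to be actively probed. The parameters each endpoint takes and the dependencies between endpoints can be fairly complex, and a business scenario is usually completed through a chain of requests. The typical case is logging in first, obtaining a token, and then making the subsequent API calls. Postman handles this kind of scenario through its GUI, but operations teams need to monitor APIs on a schedule and according to policy. This article walks through building an API monitoring setup from scratch.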
Prerequisites:
1. How to use Postman
2. Docker basics
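1. Export the collection from Postman
The file below probes httpbin.org as an example. Export the probe definition as a JSON file (httpbin.json) from the Postman GUI. The example contains two requests: one simulates authentication, the other simulates a business API call.
2. Run the exported file in Docker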
docker run -d -p 8080:8080 -v "$(pwd)/httpbin.json":/runner/collection.json kevinniu666/postman-prometheus:1.0.0
3. Fetch the probe metrics
curl 10.128.120.52:8080/metrics
# TYPE postman_lifetime_runs_total counter
postman_lifetime_runs_total{collection="httpbin"} 1
# TYPE postman_lifetime_iterations_total counter
postman_lifetime_iterations_total{collection="httpbin"} 1
# TYPE postman_lifetime_requests_total counter
postman_lifetime_requests_total{collection="httpbin"} 2
# TYPE postman_stats_iterations_total gauge
postman_stats_iterations_total{collection="httpbin"} 1
# TYPE postman_stats_iterations_failed gauge
postman_stats_iterations_failed{collection="httpbin"} 0
# TYPE postman_stats_requests_total gauge
postman_stats_requests_total{collection="httpbin"} 2
# TYPE postman_stats_requests_failed gauge
postman_stats_requests_failed{collection="httpbin"} 0
# TYPE postman_stats_tests_total gauge
postman_stats_tests_total{collection="httpbin"} 2
# TYPE postman_stats_tests_failed gauge
postman_stats_tests_failed{collection="httpbin"} 0
# TYPE postman_stats_test_scripts_total gauge
postman_stats_test_scripts_total{collection="httpbin"} 4
# TYPE postman_stats_test_scripts_failed gauge
postman_stats_test_scripts_failed{collection="httpbin"} 0
# TYPE postman_stats_assertions_total gauge
postman_stats_assertions_total{collection="httpbin"} 3
# TYPE postman_stats_assertions_failed gauge
postman_stats_assertions_failed{collection="httpbin"} 0
# TYPE postman_stats_transfered_bytes_total gauge
postman_stats_transfered_bytes_total{collection="httpbin"} 794
# TYPE postman_stats_resp_avg gauge
postman_stats_resp_avg{collection="httpbin"} 541
# TYPE postman_stats_resp_min gauge
postman_stats_resp_min{collection="httpbin"} 494
# TYPE postman_stats_resp_max gauge
postman_stats_resp_max{collection="httpbin"} 588
# TYPE postman_request_status_code gauge
postman_request_status_code{request_name="authentication",iteration="0",collection="httpbin"} 200
# TYPE postman_request_resp_time gauge
postman_request_resp_time{request_name="authentication",iteration="0",collection="httpbin"} 588
# TYPE postman_request_resp_size gauge
postman_request_resp_size{request_name="authentication",iteration="0",collection="httpbin"} 54
# TYPE postman_request_status_ok gauge
postman_request_status_ok{request_name="authentication",iteration="0",collection="httpbin"} 1
# TYPE postman_request_failed_assertions gauge
postman_request_failed_assertions{request_name="authentication",iteration="0",collection="httpbin"} 0
# TYPE postman_request_total_assertions gauge
postman_request_total_assertions{request_name="authentication",iteration="0",collection="httpbin"} 1
# TYPE postman_request_status_code gauge
postman_request_status_code{request_name="business-request",iteration="0",collection="httpbin"} 200
# TYPE postman_request_resp_time gauge
postman_request_resp_time{request_name="business-request",iteration="0",collection="httpbin"} 494
# TYPE postman_request_resp_size gauge
postman_request_resp_size{request_name="business-request",iteration="0",collection="httpbin"} 740
# TYPE postman_request_status_ok gauge
postman_request_status_ok{request_name="business-request",iteration="0",collection="httpbin"} 1
# TYPE postman_request_failed_assertions gauge
postman_request_failed_assertions{request_name="business-request",iteration="0",collection="httpbin"} 0
# TYPE postman_request_total_assertions gauge
postman_request_total_assertions{request_name="business-request",iteration="0",collection="httpbin"} 2
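Before wiring the endpoint into Prometheus, it can help to sanity-check the output by hand. The following is a minimal Python sketch, not part of the project, that parses this text format and lists the requests that returned a non-OK status or failed an assertion. The parser is deliberately simplified: it ignores comment lines, does not handle optional trailing timestamps, and does not unescape label values. The function names are made up for this example.

```python
import re
from urllib.request import urlopen

# Matches: name{label="value",...} number   (the label block is optional)
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from the metrics text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and # TYPE / # HELP comments
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # skip anything this simplified parser cannot read
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def failing_requests(samples):
    """Sorted names of requests with a non-OK status or a failed assertion."""
    failed = set()
    for name, labels, value in samples:
        request = labels.get("request_name")
        if request is None:
            continue  # collection-level metric, not tied to one request
        if name == "postman_request_status_ok" and value != 1:
            failed.add(request)
        if name == "postman_request_failed_assertions" and value != 0:
            failed.add(request)
    return sorted(failed)


if __name__ == "__main__":
    # Address taken from the curl command above; adjust to your own node.
    with urlopen("http://10.128.120.52:8080/metrics", timeout=10) as resp:
        body = resp.read().decode("utf-8")
    print(failing_requests(parse_metrics(body)))
```

For the output shown above, both requests report postman_request_status_ok as 1 and postman_request_failed_assertions as 0, so the list of failing requests would be empty.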
4. Feed the metrics into Prometheus by adding the following to the Prometheus configuration file.
- job_name: http-api-monitor
  scrape_interval: 15s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: http
  static_configs:
    - targets:
        - 10.128.120.52:8080  # IP address of the node running the container
      labels:
        usage: "httpbin API probing"
5. Configure Prometheus alerting rules. You can define more detailed rules yourself based on the metrics exposed above.
- alert: ApiStatusCodeAbnormal
  expr: postman_request_status_code != 200
  for: 1m
  labels:
    severity: error
  annotations:
    summary: "API response code is not 200"
    description: "Backend API probe failed"
- alert: ApiResponseAssertionFailed
  expr: postman_request_failed_assertions != 0
  for: 1m
  labels:
    severity: error
  annotations:
    summary: "API response content check failed"
    description: "Backend API did not pass its Postman tests"
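The for: 1m clause means a single bad probe does not page anyone: the expression has to stay true across evaluations for a full minute before the alert fires. The Python sketch below is a simplified model of that pending/firing behavior for the status-code rule, assuming one evaluation every 15 seconds. It is an illustration only, not Prometheus code, and the class and function names are invented for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlertState:
    """Simplified model of one alert with a `for:` hold duration."""
    hold_seconds: float
    active_since: Optional[float] = None

    def evaluate(self, condition: bool, now: float) -> str:
        if not condition:
            # The expression stopped matching: reset the timer.
            self.active_since = None
            return "inactive"
        if self.active_since is None:
            self.active_since = now  # first evaluation where it matched
        if now - self.active_since >= self.hold_seconds:
            return "firing"
        return "pending"


def status_code_abnormal(code: float) -> bool:
    """Mirrors the rule expression postman_request_status_code != 200."""
    return code != 200


if __name__ == "__main__":
    state = AlertState(hold_seconds=60)  # for: 1m
    # (timestamp in seconds, observed status code); made-up sample data
    observations = [(0, 200), (15, 500), (30, 500), (60, 500), (75, 500), (90, 200)]
    for now, code in observations:
        print(now, code, state.evaluate(status_code_abnormal(code), now))
```

With the made-up observations in the sketch, the alert becomes pending at 15 seconds, fires at 75 seconds once the condition has held for 60 seconds, and returns to inactive as soon as a 200 is observed again.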
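What remains is to hook this up to Alertmanager and route the alerts to the operations team. If you need Grafana dashboards, the project documents how to set them up. Source code: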
GitHub - kevinniu666/postman-prometheus: Run Postman collections continuously and export results as Prometheus metrics
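The core of this setup is the container, which turns Postman collection runs into metrics that Prometheus can scrape. Its source code is available at the repository linked above; if you are familiar with Node.js, you can modify it yourself. Some of the container's behavior can be controlled through environment variables, listed below:
Here is a chart of business probing from a production environment:
Author: 一直学下去
Original link: https://blog.csdn.net/lwlfox/article/details/127023067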