Krishnakumar Chandran

Alert automation and self-healing

TCS Logo

Highlights

  • Reduced duplicate alerts by 80%
  • Handled alerts intelligently: probable cause analysis finds and fixes the root cause, so alerts from dependent tasks can be ignored
  • Created adapters to integrate ignio with ServiceNow, SolarWinds and AppDynamics using REST APIs
  • Created dynamic profiles of systems, VMs, applications, databases, network and storage entities
  • Used the dynamic profiles to predict whether an alert is valid under the given conditions
  • Automatically verified and applied self-healing scripts to address those alerts
  • Automated more than 90% of repeated IT alerts, enabling the technical team to work on solutions rather than monitoring and troubleshooting
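
The duplicate reduction and probable cause analysis described above can be sketched roughly as follows. This is a minimal illustration, not how ignio works internally: the `Alert` fields, the 300-second window, the entity names and the `depends_on` map are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    entity: str       # hypothetical entity name, e.g. "db01"
    check: str        # hypothetical check name, e.g. "unreachable"
    timestamp: float  # seconds since some reference time


def deduplicate(alerts, window=300.0):
    """Drop repeats of the same (entity, check) seen within `window` seconds
    of the last alert that was kept for that pair."""
    last_kept = {}
    kept = []
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        key = (alert.entity, alert.check)
        if key not in last_kept or alert.timestamp - last_kept[key] >= window:
            kept.append(alert)
            last_kept[key] = alert.timestamp
    return kept


def probable_causes(alerts, depends_on):
    """Keep only alerts whose entity has no alerting upstream dependency.

    `depends_on` maps an entity to the entities it relies on (assumed input).
    """
    alerting = {a.entity for a in alerts}

    def has_alerting_upstream(entity, seen):
        for parent in depends_on.get(entity, ()):
            if parent in seen:
                continue  # guard against cycles in the dependency map
            seen.add(parent)
            if parent in alerting or has_alerting_upstream(parent, seen):
                return True
        return False

    return [a for a in alerts if not has_alerting_upstream(a.entity, {a.entity})]


# Example: an application depends on a database, which depends on a VM.
depends_on = {"app01": ["db01"], "db01": ["vm01"]}
alerts = [
    Alert("vm01", "unreachable", 0.0),
    Alert("vm01", "unreachable", 30.0),   # duplicate within the window
    Alert("db01", "unreachable", 40.0),   # consequence of vm01 being down
    Alert("app01", "http_5xx", 45.0),     # consequence of db01 being down
]
unique = deduplicate(alerts)
roots = probable_causes(unique, depends_on)
print([a.entity for a in roots])
```

In this sketch only the `vm01` alert survives both steps, so a self-healing script would target the VM rather than the database or the application that depend on it.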

Problem Statement

  • The aim is to reduce the workload of the IT Operations and Monitoring team by reducing duplicate tickets, cutting troubleshooting effort, and automating alert handling using AI

Ideation

Solution Methodology

Results and Conclusion

I'm Krishna

Software Engineer