Site Reliability Engineer Intern job at Interintel Technologies Limited
Site Reliability Engineer Intern
2026-03-03T12:18:29+00:00
Interintel Technologies Limited
INTERN
Nairobi
Nairobi
00100
Kenya
Information Technology
Computer & IT, Science & Engineering
KES
MONTH
2026-03-09T17:00:00+00:00
8

About the Company

We are a team of passionate individuals who aspire to address present and future IT-based challenges by offering cutting-edge software design and development, infrastructure, mobile-commerce solutions, and go-to-market services to our clients. Our key strength lies in the ability to build innovative systems that can be easily integrated with client network...

Job Summary

The Site Reliability Engineer Intern will help apply software engineering principles to IT operations to ensure the company's platforms are reliable, scalable, observable, and efficient. The role focuses on automation, monitoring, incident management, infrastructure as code, and measurable reliability targets (SLIs/SLOs) to guarantee high availability and performance across all products.

Duties and Responsibilities

  • Assist in designing, implementing, and continuously improving system reliability, availability, and performance by helping define and monitor SLIs, SLOs, and error budgets across all assigned platforms.
  • Help build and manage a robust monitoring and observability framework using Prometheus, Grafana, and Loki to track latency, traffic, errors, system health, and user impact.
  • Assist in automating infrastructure provisioning, scaling, and configuration management using Infrastructure as Code principles with Terraform and Kubernetes to ensure consistency, scalability, and disaster recovery readiness.
  • Participate in incident response processes, including detection, escalation, resolution, communication, and blameless postmortems to prevent recurrence.
  • Help reduce manual operational workload through automation, scripting, and process optimization to improve efficiency and release velocity.
  • Support high availability and performance of business-critical systems.
  • Collaborate with Engineering, Product, and DevOps teams to improve deployment safety, capacity planning, cost optimization, and system scalability.
  • Assist in establishing alerting strategies and reliability standards that minimize alert fatigue while ensuring rapid detection and resolution of production issues.

Required Knowledge, Qualification and Experience

  • Bachelor's Degree in Computer Science, Information Technology, or a related field.
  • Some exposure to Kubernetes and cloud networking.
  • Some experience with monitoring and observability tools.
  • Good exposure to managing production systems in cloud environments.
  • Some exposure to implementing and managing CI/CD pipelines using tools like Jenkins, GitLab CI/CD, or equivalent.
  • Some exposure to cloud platforms (AWS, Azure, Google Cloud) and containerization tools like Docker and Kubernetes.
  • Basic hands-on exposure to monitoring and metrics systems such as Prometheus.
  • Basic familiarity with dashboarding and visualization tools such as Grafana.
  • Foundational understanding of log aggregation systems such as Loki.
  • Familiarity with Linux environments and basic system commands.
  • Exposure to scripting concepts using Python, Bash, or similar languages.
  • Foundational knowledge of Artificial Intelligence (AI) and good exposure to AI agents; relevant certifications in AI or related disciplines will be an added advantage.
JOB-69a6d195d7f4e

Vacancy title:
Site Reliability Engineer Intern

[Type: INTERN, Industry: Information Technology, Category: Computer & IT, Science & Engineering]

Jobs at:
Interintel Technologies Limited

Deadline of this Job:
Monday, March 9 2026

Duty Station:
Nairobi | Nairobi

Summary
Date Posted: Tuesday, March 3 2026, Base Salary: Not Disclosed

JOB DETAILS:

Work Hours: 8

Experience: No Requirements

Level of Education: bachelor degree

Job application procedure
Interested in applying for this job? Click here to submit your application now.

Send your resume and portfolio with the subject SITE RELIABILITY ENGINEER INTERN

Submission deadline: 9th March 2026


Job Info
Job Category: Internships/ Trainee jobs in Kenya
Job Type: Full-time
No of Jobs: 1
