2019-01-20 16:08:28 +03:00
# Site Reliability Engineer (SRE) Interview Preparation Guide
This repository consolidates useful resources for preparing for Site Reliability Engineer (SRE) interviews.
## Basics
- [ ] Simple: [What happens when you type in ‘www.cnn.com’ in your browser? ](https://syedali.net/2013/08/18/what-happens-when-you-type-in-www-cnn-com-in-your-browser )
- [ ] Detailed: [What happens when you type google.com into your browser's address box and press enter? ](https://github.com/alex/what-happens-when )
## Linux
### Boot Process
- [ ] [An introduction to the Linux boot and startup processes ](https://opensource.com/article/17/2/linux-boot-and-startup )
- [ ] [What happens when we turn on computer? ](https://www.cdn.geeksforgeeks.org/what-happens-when-we-turn-on-computer )
- [ ] [What happens when we turn on computer? ](https://leetcode.com/discuss/interview-question/125107/What-happens-when-we-turn-on-computer )
- [ ] [From Power up to login prompt ](http://www.scott-a-s.com/files/linux_boot.pdf )
### Filesystem
- [ ] [Understanding Inodes ](https://syedali.net/2015/02/08/understanding-inodes )
- [ ] [Understand UNIX / Linux Inodes Basics with Examples ](https://www.thegeekstuff.com/2012/01/linux-inodes )
- [ ] [Understanding proc filesystem ](https://syedali.net/2013/08/20/understanding-proc-filesystem )
- [ ] [Common Mount Options ](https://syedali.net/2015/01/06/common-mount-options )
- [ ] [Understanding Linux filesystems: ext4 and beyond ](https://opensource.com/article/18/4/ext4-filesystem )
### Kernel
- [ ] [Explain the basics of Linux kernel ](http://learnlinuxconcepts.blogspot.com/2014/03/explain-basics-of-linux-kernel.html )
- [ ] [Kernel Space and User Space ](http://learnlinuxconcepts.blogspot.com/2014/02/kernel-space-and-user-space.html )
- [ ] [Linux Kernel Process Management ](http://learnlinuxconcepts.blogspot.com/2014/03/process-management.html )
- [ ] [Linux Addressing ](http://learnlinuxconcepts.blogspot.com/2014/02/linux-addressing.html )
- [ ] [Linux Kernel Memory Management ](http://learnlinuxconcepts.blogspot.com/2014/02/linux-memory-management.html )
- [ ] [STACK AND HEAP ](http://learnlinuxconcepts.blogspot.com/2014/02/stack-and-heap.html )
- [ ] [Paging and Segmentation ](http://learnlinuxconcepts.blogspot.com/2014/02/paging-and-segmentation.html )
- [ ] [Linux Kernel System Calls ](http://learnlinuxconcepts.blogspot.com/2014/02/system-calls.html )
- [ ] [The Virtual Filesystem ](http://learnlinuxconcepts.blogspot.com/2014/10/the-virtual-filesystem.html )
- [ ] [Concurrency and Race Conditions ](http://learnlinuxconcepts.blogspot.com/2014/07/concurrency-and-race-conditions.html )
- [ ] [Memory Leak ](https://stackoverflow.com/questions/312069/the-best-memory-leak-definition )
- [ ] [What is a kernel Panic? ](http://learnlinuxconcepts.blogspot.com/2014/07/what-is-kernel-panic.html )
### Troubleshooting
- [ ] [Linux troubleshooting tools ](https://syedali.net/2013/08/20/linux-troubleshooting-tools )
- [ ] [Linux Performance Analysis in 60,000 Milliseconds ](https://medium.com/netflix-techblog/linux-performance-analysis-in-60-000-milliseconds-accc10403c55 )
- [ ] [strace ](https://www.dedoimedo.com/computers/strace.html )
- [ ] [lsof ](https://www.dedoimedo.com/computers/lsof.html )
- [ ] [Linux system debugging ](https://www.dedoimedo.com/computers/linux-system-debugging-super.html )
- [ ] [SaaS where users can test their Linux troubleshooting skills ](https://sadservers.com )
## Networking
- [ ] [Network protocols for anyone who knows a programming language ](https://www.destroyallsoftware.com/compendium/network-protocols?share_key=97d3ba4c24d21147 )
- [ ] [Introduction to Linux interfaces for virtual networking ](https://developers.redhat.com/blog/2018/10/22/introduction-to-linux-interfaces-for-virtual-networking )
- [ ] [Multi-tier load-balancing with Linux ](https://vincent.bernat.ch/en/blog/2018-multi-tier-loadbalancer )
- [ ] [Introduction to modern network load balancing and proxying ](https://blog.envoyproxy.io/introduction-to-modern-network-load-balancing-and-proxying-a57f6ff80236 )
- [ ] [Load Balancing Algorithms ](https://syedali.net/2013/08/22/load-balancing-algorithms )
## Containers
- [ ] [Introduction to Docker and Containers ](http://container.training/intro-selfpaced.yml.html )
- [ ] [Containers Patterns ](https://l0rd.github.io/containerspatterns )
- [ ] [Docker Container Anti Patterns ](https://blog.couchbase.com/docker-container-anti-patterns/ )
- [ ] [Anti-Patterns When Building Container Images ](https://jpetazzo.github.io/2021/11/30/docker-build-container-images-antipatterns )
## Kubernetes
- [ ] [Deploying and Scaling Microservices with Docker and Kubernetes ](http://container.training/kube-selfpaced.yml.html )
- [ ] [Demystifying the Kubernetes Iceberg ](https://asankov.dev/blog/2022/05/15/demystifying-the-kubernetes-iceberg-part-1 )
- [ ] [What happens when ... Kubernetes edition! ](https://github.com/jamiehannaford/what-happens-when-k8s/blob/master/README.md )
- [ ] [Kubernetes Production Patterns ](https://github.com/gravitational/workshop/blob/master/k8sprod.md )
- [ ] [Kubernetes production best practices ](https://learnk8s.io/production-best-practices )
- [ ] [A Guide to the Kubernetes Networking Model ](https://sookocheff.com/post/kubernetes/understanding-kubernetes-networking-model )
- [ ] [47 Things To Become a Kubernetes Expert ](https://ymmt2005.hatenablog.com/entry/k8s-things )
- [ ] [Kubernetes Best Practices 101 ](https://github.com/diegolnasc/kubernetes-best-practices )
- [ ] [15 Kubernetes Best Practices Every Developer Should Know ](https://spacelift.io/blog/kubernetes-best-practices )
## Infrastructure as code / Configuration management
- [ ] [Terraform ](https://learn.hashicorp.com/terraform )
- [ ] [A Comprehensive Guide to Terraform ](https://blog.gruntwork.io/a-comprehensive-guide-to-terraform-b3d32832baca )
- [ ] [Ansible ](https://github.com/leucos/ansible-tuto )
- [ ] [Getting Started With Terraform on AWS ](https://spacelift.io/blog/terraform-tutorial )
## Databases
- [ ] [Things You Should Know About Databases ](https://architecturenotes.co/things-you-should-know-about-databases )
- [ ] [7 Database Paradigms ](https://youtu.be/W2Z7fbCLSTw )
- [ ] [CAP theorem ](https://en.wikipedia.org/wiki/CAP_theorem )
- [ ] [Evolutionary Database Design ](https://martinfowler.com/articles/evodb.html )
- [ ] [ACID vs BASE in Databases ](https://medium.com/geekculture/acid-vs-base-in-databases-1bcad774da26 )
- [ ] [Understanding Database Sharding ](https://www.digitalocean.com/community/tutorials/understanding-database-sharding )
- [ ] [Database Replication ](https://galeracluster.com/library/documentation/tech-desc-introduction.html#database-replication )
- [ ] [SQL vs. NoSQL Database: When to Use, How to Choose ](https://towardsdatascience.com/datastore-choices-sql-vs-nosql-database-ebec24d56106 )
- [ ] [How do database indexes work? ](https://planetscale.com/blog/how-do-database-indexes-work )
## CI/CD
- [ ] [7 Pipeline Design Patterns for Continuous Delivery ](https://www.singlestoneconsulting.com/blog/7-pipeline-design-patterns-for-continuous-delivery )
- [ ] [CI/CD patterns ](https://continuousdelivery.com/implementing/patterns )
- [ ] [Six Strategies for Application Deployment ](https://thenewstack.io/deployment-strategies )
## Clouds
- [ ] [The Open Guide to Amazon Web Services ](https://github.com/open-guides/og-aws )
- [ ] [Learning Azure ](https://docs.microsoft.com/en-us/learn/azure/ )
- [ ] [Hands-On Training with GCP ](https://cloud.google.com/training/badges )
## Programming
### Python
- [ ] [Python Basics ](https://pythonbasics.org/ )
- [ ] [Python For Everyone ](https://www.py4e.com/ )
- [ ] [Complete Python Tutorial ](https://www.scaler.com/topics/python/ )
### Go (Golang)
- [ ] [A tour of Go ](https://tour.golang.org )
- [ ] [Go by Example ](https://gobyexample.com )
- [ ] [Learn Go with Tests ](https://quii.gitbook.io/learn-go-with-tests/ )
- [ ] [Getting up and running with Go ](http://www.golangprograms.com )
- [ ] [Effective Go ](https://golang.org/doc/effective_go.html )
- [ ] [Go Design Patterns ](https://github.com/tmrts/go-patterns )
- [ ] [Go Memory Management ](https://povilasv.me/go-memory-management )
### Big O Notation, Algorithms and Data Structures
- [ ] [AlgoExpert ](https://www.algoexpert.io )
- [ ] [Hacking a Google Interview – Handout 1 ](http://courses.csail.mit.edu/iap/interview/Hacking_a_Google_Interview_Handout_1.pdf )
- [ ] [Hacking a Google Interview – Handout 2 ](http://courses.csail.mit.edu/iap/interview/Hacking_a_Google_Interview_Handout_2.pdf )
- [ ] [Hacking a Google Interview – Handout 3 ](http://courses.csail.mit.edu/iap/interview/Hacking_a_Google_Interview_Handout_3.pdf )
## System design
- [ ] [SystemsExpert course from AlgoExpert ](https://www.algoexpert.io/se/product )
- [ ] [Grokking the System Design Interview ](https://www.educative.io/collection/5668639101419520/5649050225344512 )
- [ ] [The System Design Primer ](https://github.com/donnemartin/system-design-primer )
- [ ] [Crack the System Design Interview ](https://www.puncsky.com/blog/2016/02/14/crack-the-system-design-interview )
- [ ] [System design interview for IT companies ](https://github.com/checkcheckzz/system-design-interview )
- [ ] [Web Architecture 101 ](https://medium.com/storyblocks-engineering/web-architecture-101-a3224e126947 )
- [ ] [What's in a Production Web Application? ](https://web.archive.org/web/20210106095747/http://stephenmann.io/post/whats-in-a-production-web-application )
- [ ] [Distributed systems ](http://book.mixu.net/distsys/single-page.html )
### System design examples
- [ ] [Designing WhatsApp ](http://highscalability.com/blog/2022/1/3/designing-whatsapp.html )
- [ ] [Designing Uber ](http://highscalability.com/blog/2022/1/25/designing-uber.html )
- [ ] [Designing Tinder ](http://highscalability.com/blog/2022/1/17/designing-tinder.html )
- [ ] [Designing Instagram ](http://highscalability.com/blog/2022/1/11/designing-instagram.html )
- [ ] [Designing Netflix ](http://highscalability.com/blog/2021/12/13/designing-netflix.html )
## Monitoring
- [ ] [SLOs & You: A Guide To Service Level Objectives ](https://www.circonus.com/2018/07/a-guide-to-service-level-objectives )
- [ ] [Setting up Service Monitoring — The Why’s and What’s ](https://amitosh.medium.com/the-whys-and-what-s-of-setting-up-service-monitoring-cc1c165ee088 )
- [ ] [How NOT to Measure Latency ](https://youtu.be/lJ8ydIuPFeU )
## Processes
- [ ] [The practical guide to incident management ](https://incident.io/guide )
- [ ] [Incident Response ](https://response.pagerduty.com )
- [ ] [Postmortems ](https://postmortems.pagerduty.com )
- [ ] [Runbooks ](https://www.transposit.com/devops-blog/itsm/what-makes-a-good-runbook )
- [ ] [Identifying and tracking toil using SRE principles ](https://cloud.google.com/blog/products/management-tools/identifying-and-tracking-toil-using-sre-principles )
- [ ] [Building SRE from Scratch ](https://medium.com/ibm-garage/building-sre-from-scratch-485e23985bbd )
- [ ] [SRE at Google: Our complete list of CRE life lessons ](https://cloud.google.com/blog/products/devops-sre/sre-at-google-our-complete-list-of-cre-life-lessons )
- [ ] [Incident Management vs. Incident Response - What's the Difference? ](https://rootly.io/blog/incident-management-vs-incident-response-what-s-the-difference )
- [ ] [Practical Guide to SRE: Using SLOs to Increase Reliability ](https://rootly.io/blog/practical-guide-to-sre-using-slos-to-increase-reliability )
- [ ] [Practical Guide to SRE: Automating On-Call ](https://rootly.io/blog/practical-guide-to-sre-automating-on-call )
- [ ] [Going from Zero to SRE ](https://www.squadcast.com/blog/going-from-zero-to-sre )
- [ ] [An Incident Command Training Handbook ](https://blog.danslimmon.com/2019/06/24/an-incident-command-training-handbook )
## Resume
- [ ] [SRE Complete Resume Writing Guide ](https://rootly.com/blog/sre-complete-resume-writing-guide )
## Interview
### SRE interview process
- [ ] [How to hire talent ](https://syedali.net/2014/04/01/how-to-hire-talent )
- [ ] [Recruitment process for a Google job (SRE, Site Reliability Engineer) ](https://web.archive.org/web/20220328124724/http://lambda-startup.com/recruitment-process-for-a-google-job-sre-site-reliability-engineer )
### Interview Questions
- [ ] [A collection of questions to practice with for SRE interviews ](https://github.com/michael-kehoe/sre-interview )
- [ ] [SRE Interview Questions ](https://syedali.net/engineer-interview-questions )
- [ ] [Sysadmin Test Questions ](https://github.com/trimstray/test-your-sysadmin-skills )
- [ ] [Kubernetes job interview questions ](https://enterprisersproject.com/article/2019/2/kubernetes-job-interview-questions-how-prepare )
- [ ] [DevOps Guide ](https://github.com/Tikam02/DevOps-Guide )
- [ ] [Questions I ask in SRE interviews ](https://dev.to/logan/questions-i-ask-in-sre-interviews-a9j )
- [ ] [DevOps Roadmap: Learn to become a DevOps Engineer or SRE ](https://roadmap.sh/devops )
### Blogposts
- [ ] [SRE Interviews in Silicon Valley ](http://blog.marc-seeger.de/2015/05/01/sre-interviews-in-silicon-valley )
- [ ] [Preparing the SRE interview ](https://blog.balthazar-rouberol.com/preparing-the-sre-interview )
- [ ] [How to Get Into SRE ](https://blog.alicegoldfuss.com/how-to-get-into-sre )
- [ ] [My Job Interview at Google ](https://catonmat.net/my-job-interview-at-google )
- [ ] [Path to Site Reliability Management ](https://danrl.com/srm )
- [ ] [Becoming a Site Reliability Engineer ](https://www.tik.dev/blog/becoming-an-sre )
- [ ] [How I get a job at Google as SRE ](https://fabrizio2210.medium.com/how-i-get-a-job-at-google-as-sre-83d44aef7859 )
## Books
### SRE books
- [ ] [Site Reliability Engineering ](https://sre.google/sre-book/table-of-contents )
- [ ] [The Site Reliability Workbook ](https://sre.google/workbook/table-of-contents )
- [ ] [Seeking SRE ](https://books.google.ru/books?id=tmhqDwAAQBAJ )
- [ ] [Building Secure and Reliable Systems ](https://sre.google/books/building-secure-reliable-systems )
- [ ] [Implementing Service Level Objectives ](https://learning.oreilly.com/library/view/implementing-service-level/9781492076803 )
### Linux
- [ ] [Linux Kernel Development (3rd Edition) ](https://www.amazon.com/Linux-Kernel-Development-Robert-Love/dp/0672329468 )
- [ ] [UNIX and Linux System Administration Handbook (5th Edition) ](https://www.amazon.com/UNIX-Linux-System-Administration-Handbook/dp/0134277554 )
- [ ] [Linux Pocket Guide, 3rd Edition ](http://shop.oreilly.com/product/0636920040927.do )
### Networking
- [ ] [TCP/IP Illustrated, Volume 1 ](https://www.amazon.com/TCP-Illustrated-Protocols-Addison-Wesley-Professional/dp/0321336313 )
### Troubleshooting and Performance
- [ ] [Systems Performance: Enterprise and the Cloud ](https://www.amazon.com/Systems-Performance-Enterprise-Brendan-Gregg/dp/0133390098 )
- [ ] [Systems Performance, 2nd Edition ](https://www.informit.com/store/systems-performance-9780136820154?ranMID=24808 )
## Courses
- [ ] [Site Reliability Engineering: Measuring and Managing Reliability ](https://www.coursera.org/learn/site-reliability-engineering-slos )
- [ ] [School of SRE ](https://linkedin.github.io/school-of-sre )