In this post, I'll be sharing a recap of my notes from Day 41 to 45 of learning in the Udacity Cloud DevOps Engineer Nanodegree classroom (and it was a lot 😢).
These are compiled from my daily IG posts, as part of my #3MonthsOfCDE series and only slightly edited for platform suitability.
Day 41 - Working on an Entire Blue-Green Deployment Pipeline
I spent all of yesterday working on the exercises in Lesson 4 and once again, I realised just how hard it is to learn things as an absolute beginner.
First, I tested the “Build Ansible inventory file” step automation (from the previous article) with my personal account and it ran as expected. Afterwards, I did some research on how to run the workflow with the Udacity federated user. It turns out I just had to create an IAM user, like the one I used with my personal account.
I could then use that user's access key credentials to set up environment variables in CircleCI (instead of the Udacity temporary credentials), and use those variables in the jobs that needed aws-cli.
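Since several of my failed pipelines came down to small setup mistakes, here is a minimal sketch of a check that could run before any aws-cli step. It is my own illustration, not part of the course material: the function name missing_aws_vars is hypothetical, and I am assuming the three standard variable names that aws-cli reads from the environment (the region variable is my assumption; the lesson only required the access key credentials).

```python
from typing import List, Mapping

# Standard environment variable names that aws-cli reads.
# Including the region here is an assumption for this sketch.
REQUIRED_AWS_VARS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_DEFAULT_REGION",
)


def missing_aws_vars(env: Mapping[str, str]) -> List[str]:
    """Return the required variable names that are unset or blank in env."""
    return [name for name in REQUIRED_AWS_VARS if not env.get(name, "").strip()]
```

Calling missing_aws_vars(os.environ) at the start of a job returns an empty list when everything is set, so a job could fail early with a clear message instead of halfway through a deployment.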
Afterwards, I continued working on the exercises. How did that go?
Let's just say my CircleCI account, with its endless failed pipelines, speaks for me: I ran into a lot of different errors and I was quite frustrated. Some of the errors took a long time to spot and fix, while others were silly spelling mistakes (thanks to my frustration).
By the end of the day, I had worked on three exercises, and only two of them finished successfully.
Day 42 - Troubleshooting Blue-Green Deployment Pipelines
This day turned out much better than Day 41 because:
a) I figured out the issues with the third exercise,
b) I was able to complete the rest of the exercises (with lots of failed attempts and pipelines, of course), and
c) I now have a fully functional running pipeline, plus some extra steps I added to some cleanup jobs for better functionality! 🎉
It did not come without frustration but I am glad I fully went through Lesson 4 and all that it had to teach me. Doing so almost always makes carrying out the project a less bumpy ride.
Spoiler alert: I faced entirely different issues with the main course project.
I have also started Lesson 5 on Monitoring Environments and I hope that ends well, so I can get started on my project (and hopefully get it done and approved by the weekend). Sayonara!
Another spoiler: I did not. It is still ongoing.
Day 43 - Monitoring Environments With Prometheus
What happens when you get 2 days off a work week and you resume?
Well, I did not spend much time with the last Lesson I started on "Monitoring tools", because there was a backlog of tasks that needed to be completed and pushed to production.
Still, I went through the introductory sections and I am now working on the Prometheus setup exercise (Prometheus is the tool of choice for the course).
I also took a look at the Project contents to get an idea of what I would be working on. Things look familiar now, so I am almost ready!
Day 44 + 45 - Working with Prometheus
It’s safe to say I went through IT with this last lesson!
While I understand that there are different ways of doing things, the materials in the lesson were quite confusing. The instructor videos used a setup for the Prometheus server (+ node_exporter and alert managers) that was entirely different from the “tutorial links” provided to replicate.
At first, I chose the setup in the videos because it was faster and required less configuration (plus the tutorial links seemed quite old). However, for some reason, I switched to the tutorial setups and that wasted my time.
Of course, it could be that I did not set things up correctly (though I am fairly certain I did), but the issues were most likely from the outdated approach. I eventually finished the lessons by going back to the first setup from the instructor videos. As always, I encountered errors whilst setting things up but they all worked after the fixes.
One situation took more time though: setting up alert managers with different receivers (email and Slack). Something kept failing: the server was reporting the error and, with the rules I defined, I was supposed to get an email alert for that particular error. However, the emails were not getting sent.
I am still not completely sure what actually fixed things; I had to debug my rules.yml and alertmanager.yml files AND remember to start up the alert manager🤦🏾♀️, etc. It was most likely a combination of everything I did.
The exercises from the lessons are now completed and I have started working on the main project.
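On the "remember to start up the alert manager" point, a quick check like the one below would have saved me some head-scratching. This is only a sketch under my own assumptions: the function name is hypothetical, and I am assuming Alertmanager is listening on its default port 9093 on the same machine and answering on its /-/healthy endpoint. It says nothing about whether rules.yml or the receivers are configured correctly, only whether the process is up.

```python
import urllib.request


def alertmanager_is_healthy(
    base_url: str = "http://localhost:9093",  # assumed default address
    timeout: float = 3.0,
) -> bool:
    """Return True if Alertmanager answers its health endpoint with HTTP 200."""
    url = base_url.rstrip("/") + "/-/healthy"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers connection refused, timeouts and HTTP error responses,
        # which is what you get when the alert manager was never started.
        return False
```

If this returns False, there is no point in debugging the email or Slack receivers yet, because nothing is running to send the alerts.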
P.S. Dear Udacity, maybe update the links in the course contents more often? Or add disclaimers to correct things said in the videos that are now outdated?