A Day In the Life of an SRE at Pricefx
At Pricefx, Site Reliability Engineers (SREs) are seasoned problem-solvers. Using a range of platforms and tools, they manage various customer issues, ensuring smooth service delivery. To give you an inside look at a day in the life of an SRE, we spoke with our colleague Esther, who shared insights on her daily tasks, favorite tools, team culture, and advice for newcomers.
I’d say every day is different – that's one of the unique parts of being an SRE here. Because Pricefx is a SaaS company that supports a wide range of customers, our infrastructure spans cloud environments (AWS and GCP) and bare metal instances. This means that on any given day, I could be working on anything from triaging resource contention on a cluster running in AWS to troubleshooting disk utilization on one of our older bare metal partitions.
We also support planned activities like upgrades, migrations, or partition copies, and we handle dynamic or unplanned work like incidents, customer escalations, or internal tool improvements.
Our work is often triggered by tickets generated through the Salesforce platform. A substantial portion includes platform upgrades, migrations, or partition copies, which involve cloning customer environments while ensuring data integrity and application consistency. These operations can happen on both bare metal infrastructure and Next Gen environments, so the tooling and performance considerations vary.
We also handle a broader range of issues, from analyzing logs from distributed services to optimizing storage in a bare metal setup. And when there’s an infrastructure alert, like CPU pressure or disk saturation, we respond quickly to prevent impact on customer SLAs.
Yes, that is correct, especially when it's related to infrastructure.
Yes, for critical or high-impact issues that require immediate attention, the support team escalates them to us for resolution. In addition, customers can raise tickets for updates through the Salesforce platform, and we can proactively reach out to them when it's time for an upgrade.
When a ticket is assigned to you, you own the resolution — but that doesn’t mean you're alone. I’ve genuinely come to appreciate how collaborative the SRE team is at Pricefx. We’re constantly drawing from each other’s experiences. The team has deep expertise across both AWS and legacy systems, and leadership is incredibly approachable.
That’s right. While I take ownership of the issue, I can always rely on my team for support. Technology evolves quickly, so having a collaborative environment really makes a difference when tackling complex challenges.
I enjoy working across complex systems — especially when I get to resolve an issue that spans multiple layers. The exposure to both traditional and cloud-native architecture gives me a broader skill set than I'd get in a cloud-only setup.
The biggest challenge is also what makes the role exciting — the constantly evolving technology landscape. Supporting Next Gen means I need to stay current on changes in Kubernetes, observability stacks, and cloud operations, while still being fluent in managing legacy workloads. It keeps me learning every day.
Definitely. We have access to Udemy, which offers a lot of technical deep-dives; I’ve done courses on Postgres, Linux, and Kubernetes. Our Confluence documentation is also well-maintained, with architecture diagrams, runbooks, and other reference material.
I also have regular 1:1s with my manager where we review areas for growth or recap interesting incidents and how they were resolved. The culture encourages staying curious.
For me, Grafana is top of the list because it gives us actionable insights and observability across both cloud-native workloads and bare metal partitions, whether we're looking at CPU saturation, request latency, or service uptime.
I’m also a big fan of Git. Being able to track changes in Terraform modules or Kubernetes manifests while running an upgrade gives us a strong audit trail and a rollback safety net. We also regularly use internal CLI scripts for deployment automation.
It can be, but with the right tooling and understanding, it’s manageable. Once you understand the underlying architecture — whether it’s Kubernetes or a legacy node in bare metal — you know where to look. That said, context switching between platforms requires discipline and good documentation.
Don’t be afraid to ask questions. The infrastructure here is diverse — we have legacy and cloud-native platforms coexisting — and it takes time to learn it all. In my first few weeks, I had a lot of support from peers and leaders. That foundation gave me the confidence to troubleshoot production-impacting issues within a short time.
Also, be proactive. Dig into worklogs, follow the observability patterns, and learn both the old and the new stacks. It’s a great place to grow if you’re hungry to learn.
Absolutely—never hesitate to ask questions or seek guidance. The team fosters a highly supportive environment where collaboration and knowledge sharing are encouraged.
Pricefx