BlackLine

Senior Site Reliability Engineer

Posted 8 Hours Ago

Hybrid

Pleasanton, CA

Senior level

The Senior Site Reliability Engineer ensures optimal performance and availability of BlackLine's cloud services by managing capacity planning, technical project execution, and software engineering tasks. Responsibilities include collaborating with teams, addressing customer escalations, identifying and resolving performance issues, and developing automation tools to enhance service reliability and efficiency.

The summary above was generated by AI

Get to Know Us:
It's fun to work in a company where people truly believe in what they're doing!
At BlackLine, we're committed to bringing passion and customer focus to the business of enterprise applications.
Since being founded in 2001, BlackLine has become a leading provider of cloud software that automates and controls the entire financial close process. Our vision is to modernize the finance and accounting function to enable greater operational effectiveness and agility, and we are committed to delivering innovative solutions and services to empower accounting and finance leaders around the world to achieve Modern Finance.
As a best-in-class SaaS company, we understand that bringing in new ideas and innovative technology is mission critical. At BlackLine we are always working with new, cutting-edge technology that encourages our teams to learn, expand their creativity and technical skill sets, and accelerate their careers.
Work, Play and Grow at BlackLine!
The successful applicant will be performing work in FedRAMP environments, and therefore, must be a US Citizen.
Make Your Mark:
The FedRAMP team plays a critical role in advancing our organization's mission to deliver secure, compliant, and highly reliable cloud solutions. This team is at the forefront of ensuring our systems meet the stringent requirements of the FedRAMP program, supporting both our internal needs and those of our customers.
As part of this team, members will have the unique opportunity to design, deploy, build, and support our FedRAMP-compliant cloud environment. They'll be instrumental in shaping the future of our infrastructure by applying cutting-edge patterns such as Site Reliability Engineering (SRE), DevOps, and DevSecOps to streamline deployments and enhance overall efficiency.
By embracing these modern approaches, the FedRAMP team will not only deliver robust and secure systems but also drive innovation and set a new standard for operational excellence across our cloud platform.
The Senior Site Reliability Engineer (SRE) plays a pivotal role in ensuring that BlackLine's services and infrastructure are carefully planned and deployed at the time, in the place, and in the configuration best suited to serving BlackLine's clients. The SRE role sits at a nexus of capacity planning, technical project execution, product planning, business analysis, site reliability, and software engineering. The Senior Site Reliability Engineer is responsible for assessing, testing, tracking, predicting, and reporting on all aspects of a suite of production applications from a scalability, performance, responsiveness, capacity, and availability perspective.
You'll Get To:

Develop and maintain subject matter expertise in BlackLine's service and infrastructure architecture, operation, and performance characteristics
Act as a primary resource for the Support organization in responding to customer escalations for performance or availability issues
Identify and communicate issues or conditions that currently prevent, or may in the future prevent, BlackLine services and infrastructure from performing as needed to meet customer expectations; act to resolve them, including determining the root cause, facilitating development of a solution, and gathering a cross-functional team as needed
Improve and maintain a continuous metric framework that observes, records, and trends real-time availability data for all of our clients
Develop and maintain on-premises and cloud capacity plans that ensure we are delivering a BlackLine service that is performant and cost effective
Collaborate with development and other technology teams on requirements definition, observability standards, capacity planning, and process refinement
Improve the BlackLine SaaS service experience by discovering and highlighting optimization opportunities with existing code to address application availability, performance, observability, efficiency, and security challenges.
Develop tools and systems to automate the identification, analysis, and remediation of application events, infrastructure issues, or requests.
Establish and maintain Key Performance Indicators (KPIs) for the overall health of the service and build tools to exercise and evaluate whether these KPIs are being met.
Work cross-functionally with other teams to surface common pain points, architect solutions, establish conventions, and evangelize application development and operations best practices.
Transform discoveries into requests to others or action items for you and your team.
Regularly learn new systems and tools as the BlackLine platform and ecosystem evolve.
Contribute knowledge, skills, and personal qualities to a dedicated team of top engineers solving real-life problems in a bleeding-edge, high-performance, and high-traffic environment.
Publish performance findings, conclusions, and recommendations
Create second-tier analysis of capacity constraint points and performance, and discuss it with development and infrastructure teams
Support integration of performance data into customer experience analytics tools and reporting
Ensure application and infrastructure capacity management efforts have verifiable capacity data to support business cases
Monitor industry trends and keep abreast of new tools and technologies.
Participate in our on-call rotation, act as crisis manager/tier 3 technical support for major incidents, and conduct incident reviews
Other duties as assigned

What You'll Bring:

BS in Computer Science or equivalent work experience
5+ years of experience with a significant subset of the following technologies: GCP, AWS, Azure, Kubernetes, HTML, CSS, XML, SOAP, Ajax, JavaScript, IIS, MSSQL, MySQL, Go, Jenkins, Chef, PowerShell, WMI, Java, Apache, Tomcat, SSL, Docker
Extensive knowledge of managing cloud platforms and cloud native tools.
Demonstrated expertise with networking and distributed systems.
Capable of participating in and leading customer-facing performance evaluations and briefings
Intermediate knowledge of at least two of the following programming languages: C#, Visual Basic, PowerShell, Java, Go, Linux Shell, Ruby.
Demonstrated history of developing or operating production web applications and solid understanding of HTTP(S), HTML, JavaScript, CSS, and XML.
Significant experience in a lead role on a software development or operations team.
Intermediate-level knowledge of IIS and Windows Server or Linux and Apache. Intermediate-level knowledge of Windows- and Linux-based systems and of automating the management of core kernel and system configurations; experience with Java and Python.
Intermediate-level knowledge of configuration management tools.
Experience with container orchestration platforms like Kubernetes.
Intermediate-level knowledge of deploying and managing observability tools such as Elastic, Kibana, and Prometheus.
Capable of producing clean, readable code in a multi-developer team environment.
Someone energized by a fast-paced, iterative approach.
Eager to learn and soak in new information.
Must maintain the highest level of integrity, courtesy and respect while interacting with internal and external customers, employees and business contacts
Excellent oral and written communication skills
Ability to interface with internal technical experts using professional interpersonal skills
Experience analyzing datasets to draw conclusions and graphing the data that supports those conclusions
Exhibit creative problem-solving, logical troubleshooting and analytical skills
Basic level proficiency in application load balancing methods (F5 LTM, Windows NLB, etc.)
Working knowledge of TCP/IP and networking concepts
Proficiency with statistical concepts: confidence intervals, hypothesis testing, and sampling
Understanding of operating system concepts such as CPU, memory, and CPU and disk queues, and experience graphing and analyzing these over time
Must possess strong organizational skills and be able to work with minimal oversight
Ability to understand new technologies quickly and adapt these into daily work and goals

We're Even More Excited If You Have:

Prior C#, ASP.NET, Ruby, Go or Java development experience, preferably in an agile SaaS environment.
Significant experience with open source platforms and technologies.
Experience with software development processes and methodologies.
Track record of architecting, developing, and implementing robust, distributed online solutions.

Thrive at BlackLine Because You Are Joining:

A technology-based company with a sense of adventure and a vision for the future. Every door at BlackLine is open. Just bring your brains, your problem-solving skills, and be part of a winning team at the world's most trusted name in Finance Automation!
A culture that is kind, open, and accepting. It's a place where people can embrace what makes them unique, and the mix of cultural backgrounds and varying interests cultivates diverse thought and perspectives.
A culture where BlackLiners' continued growth and learning are empowered. BlackLine offers a wide variety of professional development seminars and inclusive affinity groups to celebrate and support our diversity.

BlackLine is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to sex, gender identity or expression, race, ethnicity, age, religious creed, national origin, physical or mental disability, ancestry, color, marital status, sexual orientation, military or veteran status, status as a victim of domestic violence, sexual assault or stalking, medical condition, genetic information, or any other protected class or category recognized by applicable equal employment opportunity or other similar laws.
BlackLine recognizes that the ways we work and the workplace itself have shifted. We innovate in a workplace that optimizes a combination of virtual and in-person interactions to maximize collaboration and nurture our culture. Candidates who live within a reasonable commute to one of our offices will work in the office at least 2 days a week.
Salary Range:
USD $145,000.00 - USD $193,000.00
Pay Transparency Statement:
Placement within this range depends upon several factors, including the applicant's prior relevant job experience, skill set, and geographic location. In addition to base pay, BlackLine also offers short-term and long-term incentive programs, based on eligibility, along with a robust offering of benefit and wellness plans.
Accommodations:
BlackLine is committed to creating an inclusive and accessible experience for all candidates. If you require a reasonable accommodation that would better enable your success during the application or interview process, please complete this form.
