Condé Nast International publishes close to 100 high-quality websites and apps in 11 countries around the world, reaching more than 200 million unique users. Our world-famous brands include Vogue, GQ, Wired, Vanity Fair, Glamour, Condé Nast Traveller and Architectural Digest.
CNI Digital is running a major digital business transformation, encompassing a CMS platform migration, transformation of our enterprise technology and systems, and a long-term commitment to investing in our digital capabilities. Digital Technology is a key part of CNI’s future, and key to the success of this transformation.
Historically, we've had different tech stacks in markets all over the world. Now we're looking at unifying our core platform, including our CMS and data infrastructure. This is a new international team, with a hub in London, that has the opportunity to define our architecture, tooling, what we ship and how we ship it. You'll get a rare look at digital publishing around the world.
Our Site Reliability Engineers love being in the centre of all the action and play a critical role in making sure our technology stack is fit for purpose and performing optimally with zero downtime.
In this high-priority role you will tackle a range of complex application and system support issues across our CMS (Copilot), Digital Asset Management, Salesforce and a range of other technology platforms. You will also monitor AWS and Azure infrastructure in multiple geographic locations, responding to incidents and safeguarding the availability and reliability of our most popular services.
You will encourage engagement and dialogue between operations and engineering teams, to ensure knowledge is captured, documented and shared with the wider CNI community.
Because our business is online 24/7, this position is required to work a 24x7x365 rota that may include on-call duty, work outside of core hours and work on public holidays. During the formation of the team, which is expected to take up to 12 months, this rota is likely to go through changes before settling into a standard pattern.
Key Duties
Contribute to and be actively involved in every aspect of our technology environment, including:
- Troubleshoot issues at every level of the technology stack and escalate to third-level engineering teams when appropriate
- Carry out post-mortems of incidents, determining the root cause, to ensure they do not recur
- Communicate service degradations and outages across markets, following our incident management process
- Report clearly on usage and monitored event trends, using graphs and other visuals
- Support and participate in regression testing and UAT of products and their interfaces to other applications
- Work with third-party providers to support a variety of integrations and services
- Work closely with product and engineering teams to deliver new systems and software into operations
Capacity and Performance Management
- Plan and perform ongoing routine application maintenance tasks
- Assist in establishing requirements, methods and procedures for routine maintenance
- Assess application behaviour proactively, using monitoring tools to ensure all systems, services, application events and data are captured
- Monitor and operate cloud infrastructure in multiple geographic locations around the world
- Focus on automation with scalable, elegant and maintainable solutions
- Proactively monitor and plan to prevent out-of-capacity situations
- Undertake performance and availability monitoring, tuning, and reporting
- Create, structure and improve all documentation required to support our end users and Technology Services function, using CNI Digital’s Wiki
- Mentor and share knowledge with other team members
- Train the trainer where required, both within the team and in the markets
Essential Skills & Requirements
- Hands-on CMS experience including platform support
- Experience with Adobe Experience Manager 6.x and above
- Knowledge of web application architecture and experience using inspection tools to identify issues
- Event/log management experience, with the ability to identify patterns and validate against other information such as monitoring graphs/tools
- System administration of platforms such as G Suite, Okta, Slack and Salesforce, including configuration and troubleshooting
- Solid experience administering, configuring and troubleshooting Linux operating systems
- Experience with at least one scripting language such as Bash, PowerShell, Python or Perl, and some skill with another, even if it’s one we don’t use
- Knowledge of containerisation (Docker) and the ability to comprehend code
- Experience administering cloud infrastructure, configuration management and associated networking, including NLBs and CDNs, in a production environment
- Hands-on experience with monitoring and graphing solutions such as Nagios, CloudWatch, Pingdom, Datadog and Splunk
- Strong understanding of common Internet protocols such as SMTP, DNS, HTTP, SSH and SNMP
- Understanding of techniques for management of encryption keys and certificates
- Experience administering databases such as MySQL, Postgres or MongoDB
- High level of written and verbal communication skills
- Commitment to following implemented processes
- High level of attention to detail
- Ability to maintain security and confidentiality over sensitive information
- Able to work proactively with team members, vendors and other staff to achieve goals
- Team player with the ability to work autonomously
- Experience of working with software engineers and web developers
- Able to engage people on all levels
Preferred additional skills
- A second language such as Mandarin, French or Spanish is highly advantageous
- Experience of working with teams in different countries and awareness of different cultures
- Experience of working in an ITIL v3 environment, using best practice
- Degree or other formal IT qualification with experience in a business environment