Requisition ID:  2663

Senior Technical Manager, Problem Management

Who are we?

We are the IT Division of HKJC, a vibrant community of over 1,500 dedicated professionals working collaboratively across Hong Kong and Shenzhen.

Our team is a diverse mix of individuals from various backgrounds, from all across the world. We embrace our humanity, recognizing that each of us brings unique strengths and perspectives. This diversity not only enriches our work environment but also drives our innovation and creativity as we strive to achieve our collective goals.

What do we do?

We design, build, and operate the technology that powers the Club. Our primary focus is on delivering the services that support our hospitality, racing and wagering operations, so that our customers and members enjoy exceptional experiences.

We also deliver the changes necessary to drive business growth through new products and services. And we are committed to safeguarding the Club against external threats by providing a secure and resilient technological environment.

The Department

The IT Infrastructure and Platform Operations Department is responsible for the design, implementation, and management of the infrastructure that supports the Club’s IT systems, and leads the Service Management capabilities that ensure the smooth running of these systems.

This department ensures that all technological resources operate efficiently and effectively to support business objectives. Key responsibilities include:

  • Design and operate processes and controls that ensure IT service availability, performance, and resilience are aligned with business expectations.
  • Manage the 24x7 IT Operations Centre.
  • Manage the Club’s exploitation of the public cloud.
  • Manage the complete lifecycle of the Club’s IT network and the technology within our data centres.
  • Provide the roadmaps, standards, and capabilities that enable our IT infrastructure to remain current (eligible for vendor support) and secure (patched and remediated against CVEs).
  • Provide the Club’s colleague collaboration technology suite, including desktop and laptop computers, mobile devices, collaboration tools, carrier contracts, and associated support functions.

The Job

You will:

Problem Identification and Root Cause Analysis:

Lead discussions with technical teams to gather data on incident trends, hardware/software failures, and resource use. Analyse incident records to identify patterns and potential problems. Conduct thorough investigations using root cause techniques such as 5 Whys, Fishbone Diagram, and Fault Tree Analysis. Employ data analytics and AIOps tools to detect anomalies and recurring issues relevant to the Club's IT service demands. Communicate documented findings to stakeholders. During analysis, consider the four dimensions of service management: People, Process, Technology, and Supplier.

Problem Control:

Consider all contributory causes, including factors affecting incident duration and impact. Identify and document workarounds with relevant team members, ensuring clear symptom definitions. Conduct error control to find potential permanent solutions and regularly reassess unresolved errors based on customer impact, solution availability/cost, and workaround effectiveness.

Collaboration:

Work closely with SMEs, developers, and stakeholders for seamless problem resolution. Facilitate inter-team communication for unified management approaches. Establish effective meeting rhythms with clear agendas, action items, and delivery timelines. Engage external vendors/service providers as needed, maintaining open, timely communication. Collaborate with incident managers, recognising complementary but sometimes conflicting processes. Interface with risk, change, knowledge management, and continual improvement teams.

Incident Washup Calls:

Prepare and moderate washup calls after local horse racing events. Ensure communication and coordination to identify and address issues. Set urgency, drive troubleshooting, and facilitate root cause and impact discussions. Document follow-up actions (further analysis, emergency fixes, preventive measures) and track assignments to completion. Develop and implement remediation plans collaboratively, using configuration changes, software releases, or infrastructure enhancements. Summarise key findings, decisions, and next steps clearly for senior management.

Training:

Provide comprehensive problem management training for IT support and developers, including detailed guides and online resources. Conduct workshops and sessions to improve skills in root cause analysis, data analytics, AI, and machine learning. Evaluate training effectiveness via feedback, assessments, and metrics. Continuously update materials and methods to maintain relevance.

Reporting:

Prepare and deliver regular reports on problem management activities covering trends, root causes, and fixes. Track KPIs to measure process effectiveness. Create dashboards and visualisations for clear insights. Keep senior management and stakeholders informed of current issues and resolutions.

Continuous Improvement:

Continuously enhance problem management processes for better service quality and efficiency. Stay current with industry trends and best practices. Conduct regular reviews and audits to find improvement areas. Refine processes by incorporating feedback and lessons learned. Enhance related processes (Change, Incident, and Knowledge Management) to uplift overall service management capabilities. Leverage BMC Helix platform advancements and AI features to improve ITSM/ITOM by simplifying, automating, and aligning with industry standards for reliability and efficiency. Implement the roadmap for platform migration, building service models, connecting critical business activities to configuration items, and enhancing monitoring. Use AI-driven insights and predictive management to reduce MTTD and MTTR and to improve service reliability and operational efficiency.

About You

You should have:

  • A degree or higher qualification in Computer Science, Engineering, or a relevant discipline
  • Minimum 15 years of work experience in an IT environment, with 8 or more years of experience in project management of medium to large-scale IT Infrastructure projects
  • Track record of relevant experience in IT infrastructure/operations implementation projects
  • Strong technical knowledge and experience in IT service management, incident management, and problem management. Excellent analytical and problem-solving skills to identify root causes and develop effective solutions.
  • Strong verbal and written communication skills to effectively collaborate with IT teams, business users, and stakeholders. Ability to manage multiple projects and tasks simultaneously, ensuring deadlines are met and objectives are achieved.
  • ITIL Foundation certification is required; advanced ITIL certifications are a plus. Proven track record in managing and resolving complex IT issues.
  • Experience with AI and machine learning applications in ITSM, including predictive analytics and automated remediation.
  • Familiarity with the latest BMC Helix platform and its capabilities, including ServiceOps, AIOps, and ITOM technologies. Ability to drive the adoption of these technologies to improve service management processes and outcomes.

Terms of Employment

The level of appointment will be commensurate with qualification and experience.

How to Apply

Please submit your resume, complete with expected salary and job reference, by clicking the Apply Now button, or send it to:

Fax: 2966-5770

Mail: The Human Resources Department, The Hong Kong Jockey Club, 1 Sports Road, Happy Valley, Hong Kong
 

We are an equal opportunity employer. Personal data provided by job applicants will be used strictly in accordance with the Club's notice to employees and prospective employees relating to the Personal Data (Privacy) Ordinance. A copy of the notice will be provided immediately upon request.
