Challenges in Managing a Captive DC in a Heterogeneous Environment

Ankur Kheterpal, Senior Vice President IT, Info Edge India Limited | Thursday, 09 June 2016, 08:42 IST

A Data Centre (DC) for an enterprise can be on premises (captive) or hosted with any of the service providers in the marketplace.

DCs are usually classified by Tier (I, II, III, IV), which is arrived at by considering factors such as:
1. Location
2. Seismic zone
3. Number of power sources feeding the facility
4. Firefighting systems in place, and so forth

From a technology/delivery viewpoint, the industry offers a lot to choose from, such as IaaS, PaaS, AaaS and SaaS. Looked at closely, an enterprise requires all of these; they are simply presented as layers of an architecture to choose from. Ultimately an organization will need every one of these layers to run its business applications, with performance as the key deliverable and costs kept as low as the context allows.

The common belief is that in a hosted DC the cost is spread across multiple customers and is therefore low, whereas the cost of a captive DC is borne entirely by the enterprise and is therefore high. The expertise required to run DC operations may also be better managed in the hosted model. Looked at from a service delivery point of view, however, the response or turnaround time (TAT) for a change request in the hosted model can range from a few hours to a couple of days, which can have a huge impact.
In a captive DC, everything has been set up exclusively for a single entity, and if it is well architected, documented and managed, the response time/TAT can come down to a few minutes or, at most, a few hours.

In a captive DC, one would admittedly be limited by the number of resources, which in turn could mean quality over quantity, more accountability and better focus.

In either case, the service and support contracts with the relevant OEMs must be tailored carefully; otherwise costs may go beyond acceptable levels if the contracts are planned around unrealistic goals.

Both models would have the same number of maintenance layers, with the required processes, software and automation.

Performance, its monitoring/visibility, and agility are the key
The IT industry has evolved considerably, moving from physical infrastructure setups to virtual platforms, which goes a long way in helping manage heterogeneous environments. If application performance is of paramount importance, we have to look at all delivery layers, right from the provisioning of compute and memory to the last but most important piece: storage and its associated performance.

Resources such as compute and memory, once provisioned, generally deliver as expected, but storage is the animal that has to be tamed for end-to-end delivery. The KPIs to consider are IOPS, response time and throughput (bandwidth). Tiering of storage is a great option: hot data moves to faster SSDs, which improves performance dramatically, and when this is complemented with QoS for demanding apps it becomes the deal maker. The storage should be configured to the application's requirements, with the block size on storage matching the application architecture.
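
To see why block size matters alongside IOPS, a rough sketch may help. It uses the standard relationship throughput = IOPS x block size. The function names, the sample figures and the "hot data" threshold below are illustrative assumptions, not values taken from any particular storage product.

```python
def throughput_mb_per_s(iops: float, block_size_kb: float) -> float:
    """Throughput implied by an IOPS figure at a given block size (1 MB = 1024 KB)."""
    return iops * block_size_kb / 1024


def pick_tier(reads_per_hour: int, hot_threshold: int = 1000) -> str:
    """Place frequently read data on SSD; the threshold is an illustrative assumption."""
    return "ssd" if reads_per_hour >= hot_threshold else "hdd"


# The same 10,000 IOPS means very different bandwidth at different block sizes.
print(throughput_mb_per_s(10_000, 4))   # 39.0625
print(throughput_mb_per_s(10_000, 64))  # 625.0
print(pick_tier(5_000))                 # ssd
```

An application that issues small random I/O is therefore bound by IOPS and response time, while one that streams large blocks is bound by throughput, which is why the three KPIs have to be read together.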

A lot of automation and monitoring will have to be put in place across the entire DC, including the bandwidth consumption of individual apps, which, if not monitored and reported, could impact the delivery of the other applications running alongside them.
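
As a minimal sketch of the kind of bandwidth check described here, the snippet below flags applications that consume more than a chosen share of a shared link. The application names, the link capacity and the 40% limit are hypothetical; a real setup would pull these figures from its own monitoring tools.

```python
def noisy_apps(usage_mbps: dict[str, float], link_capacity_mbps: float,
               max_share: float = 0.4) -> list[str]:
    """Return apps whose bandwidth use exceeds max_share of the link, heaviest first."""
    limit = link_capacity_mbps * max_share
    flagged = [app for app, mbps in usage_mbps.items() if mbps > limit]
    return sorted(flagged, key=lambda app: usage_mbps[app], reverse=True)


# Hypothetical per-application usage on a 1 Gbps link.
sample = {"search": 120.0, "mail": 450.0, "reports": 30.0}
print(noisy_apps(sample, link_capacity_mbps=1000.0))  # ['mail']
```

Run on a schedule and fed into reporting, a check like this could surface a single app crowding out the others before users notice.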

To provide the deliverables mentioned above, the intent has to be to lay a sound foundation for at least a year's performance in advance, with adequate investment in hardware, software and automation to meet expectations for the defined period. This may seem an expensive proposition initially, but the subsequent operational cost of maintenance resources, operations, audit and review would be lower than the cost of the unknowns that impact business.

In today's evolving and dynamic IT world, most applications are web enabled, which exposes them to a higher risk of breach. To mitigate the associated security risks, the IT team must ensure that the basics are taken care of: regular patching and firmware updates for all equipment at the DC as per OEM recommendations; testing of critical and important patches before they are rolled out; periodic signature updates on all perimeter devices; and careful watching of data traffic trends for changes, to avoid unplanned outages caused by targeted or volumetric external attacks. The need for expertise, resilience and compliance obligations would also influence the decision. All of this should be regularly backed up by engaging with internal and external security audit/compliance teams, which helps ensure that our IT health and security posture are as expected.