[{"data":1,"prerenderedAt":792},["ShallowReactive",2],{"/en-us/blog/tyranny-of-the-clock":3,"navigation-en-us":37,"banner-en-us":437,"footer-en-us":447,"blog-post-authors-en-us-Craig Miskell":689,"blog-related-posts-en-us-tyranny-of-the-clock":703,"assessment-promotions-en-us":743,"next-steps-en-us":782},{"id":4,"title":5,"authorSlugs":6,"body":8,"categorySlug":9,"config":10,"content":14,"description":8,"extension":26,"isFeatured":12,"meta":27,"navigation":28,"path":29,"publishedDate":20,"seo":30,"stem":34,"tagSlugs":35,"__hash__":36},"blogPosts/en-us/blog/tyranny-of-the-clock.yml","Tyranny Of The Clock",[7],"craig-miskell",null,"engineering",{"slug":11,"featured":12,"template":13},"tyranny-of-the-clock",false,"BlogPost",{"title":15,"description":16,"authors":17,"heroImage":19,"date":20,"body":21,"category":9,"tags":22},"6 Lessons we learned when debugging a scaling problem on GitLab.com","Get a closer look at how we investigated errors originating from scheduled jobs, and how we stumbled upon \"the tyranny of the clock.\"",[18],"Craig Miskell","https://res.cloudinary.com/about-gitlab-com/image/upload/v1749667913/Blog/Hero%20Images/clocks.jpg","2019-08-27","Here is a story of a scaling problem on GitLab.com: How we found it, wrestled with it, and ultimately resolved it. And how we discovered the tyranny of the clock.\n\n## The problem\n\nWe started receiving reports from customers that they were intermittently seeing errors on Git pulls from GitLab.com, typically from CI jobs or similar automated systems. The reported error message was usually:\n```yaml\nssh_exchange_identification: connection closed by remote host\nfatal: Could not read from remote repository\n```\nTo make things more difficult, the error message was intermittent and apparently unpredictable. We weren't able to reproduce it on demand, nor identify any clear indication of what was happening in graphs or logs. The error message wasn't particularly helpful either; the SSH client was being told the connection had gone away, but that could be due to anything: a flaky client or VM, a firewall we don't control, an ISP doing something strange, or an application problem at our end. We deal with a *lot* of connections to Git-over-SSH, in the order of ~26 million a day, or 300/s average, so trying to pick out a small number of failing ones out of that firehose of data was going to be difficult. It's a good thing we like a challenge.\n\n## The first clue\n\nWe got in touch with one of our customers (thanks Hubert Hölzl from Atalanda) who was seeing the problem several times a day, which gave us a foothold. Hubert was able to supply the relevant public IP address, which meant we could run some packet captures on our frontend HAproxy nodes, to attempt to isolate the problem from a smaller data set than 'All of the SSH traffic.' Even better, they were using the [alternate-ssh port](/blog/gitlab-dot-com-now-supports-an-alternate-git-plus-ssh-port/) which means we only had two HAProxy servers to look at, not 16.\n\nTrawling through these packet traces was still not fun; despite the constraints, there was ~500MB of packet capture from about 6.5 hours. We found the short-running connections, in which the TCP connection was established, the client sent a version string identifier, and then our HAProxy immediately tore down the connection with a proper TCP FIN sequence. This was the first great clue. It told us that it was definitely the GitLab.com end that was closing the connection, not something in between the client and us, meaning this was a problem we could debug.\n\n### Lesson #1: In Wireshark, the Statistics menu has a wealth of useful tools that I'd never really noticed until this endeavor.\n\nIn particular, 'Conversations' shows you a basic breakdown of time, packets, and bytes for each TCP connection in the capture, which you can sort. I *should* have used this at the start, instead of trawling through the captures manually. In hindsight, connections with small packet counts was what I was looking for, and the Conversations view shows this easily. I was then able to use this feature to find other instances, and verify that the first instance I found was not just an unusual outlier.\n\n## Diving into logs\n\nSo what was causing HAProxy to tear down the connection to the client? It certainly seemed unlikely that it was doing it arbitrarily, and there must be a deeper reason; another layer of [turtles](https://en.wikipedia.org/wiki/Turtles_all_the_way_down), if you will. The HAProxy logs seemed like the next place to check. Ours are stored/available in GCP BigQuery, which is handy because there's a lot of them, and we needed to slice 'n dice them in lots of different ways. But first, we were able to identify the log entry for one of the incidents from the packet capture, based on time and TCP ports, which was a major breakthrough. The most interesting detail in that entry was the `t_state` (Termination State) attribute, which was `SD`. From the HAProxy documentation:\n```yaml\n\n    S: aborted by the server, or the server explicitly refused it\n    D: the session was in the DATA phase.\n\n```\n`D` is pretty clear; the TCP connection had been properly established, and data was being sent, which matched the packet capture evidence. The `S` means HAProxy received an RST, or an ICMP failure message from the backend. There was no immediate clue as to which case was occurring or possible causes. It could be anything from a networking issue (e.g. glitch or congestion) to an application-level problem. Using BigQuery to aggregate by the Git backends, it was clear it wasn't specific to any VM. We needed more information.\n\nSide note: It turned out that logs with `SD` weren't unique to the problem we were seeing. On the alternate-ssh port we get a lot of scanning for HTTPS, which leads to `SD` being logged when the SSH server sees a TLS ClientHello message while expecting an SSH greeting. This created a brief detour in our investigation.\n\nOn capturing some traffic between HAProxy and the Git server and using the Wireshark statistics tools again, it was quickly obvious that SSHD on the Git server was tearing down the connection with a TCP FIN-ACK immediately after the TCP three-way handshake; HAProxy still hadn't sent the first data packet but was about to, and when it did very shortly after, the Git server responded with a TCP RST. And thus we had the reason for HAProxy to log a connection failure with `SD`. SSH was closing the connection, apparently deliberately and cleanly, with the RST being just an artifact of the SSH server receiving a packet after the FIN-ACK, and doesn't mean anything else here.\n\n## An illuminating graph\n\nWhile watching and analyzing the `SD` logs in BigQuery, it became apparent that there was quite a bit of clustering going on in the time dimension, with spikes in the first 10 seconds after the top of each minute, peaking at about 5-6 seconds past:\n\n![Connection errors grouped by second](https://gitlab.com/gitlab-com/gl-infra/infrastructure/uploads/72cd1b763c51781fa4224495f059afb5/image.png){: .shadow.medium.center}\nConnection errors, grouped by second-of-the-minute\n\n\nThis graph is created from data collated over a number of hours, so the fact that the pattern is so substantial suggests the cause is consistent across minutes and hours, and possibly even worse at specific times of the day. Even more interesting, the average spike is 3x the base load, which means we have a fun scaling problem and simply provisioning 'more resource' in terms of VMs to meet the peak loads would potentially be prohibitively expensive. This also suggested that we were hitting some hard limit, and was our first clue to an underlying systemic problem, which I have called \"the tyranny of the clock.\"\n\nCron, or similar scheduling systems, often don't have sub-minute accuracy, and if they do, it isn't used very often because humans prefer to think about things in round numbers. Consequently, jobs will run at the start of the minute or hour or at other nice round numbers. If they take a couple of seconds to do any preparations before they do a `git fetch` from GitLab.com, this would explain the connection pattern with increases a few seconds into the minute, and thus the increase in errors around those times.\n\n### Lesson #2: Apparently a lot of people have time synchronization (via NTP or otherwise) set up properly.\n\nIf they hadn't, this problem wouldn't have emerged so clearly. Yay for NTP!\n\nSo what could be causing SSH to drop the connection?\n\n## Getting close\n\nLooking through the documentation for SSHD, we found MaxStartups, which controls the maximum number of connections that can be in the pre-authenticated state. At the top of the minute, under the stampeding herd of scheduled jobs from around the internet, it seems plausible that we were exceeding the connections limit. MaxStartups actually has three numbers: the low watermark (the number at which it starts dropping connections), a percentage of connections to (randomly) drop for any connections above the low watermark, and an absolute maximum above which all new connections are dropped. The default is 10:30:100, and our setting at this time was 100:30:200, so clearly we had increased the connections in the past. Perhaps it was time to increase it again.\n\nSomewhat annoyingly, the version of openssh on our servers is 7.2, and the only way to see that MaxStartups is being breached in that version is to turn on Debug level logging. This is an absolute firehose of data, so we carefully turned it on for a short period on only one server. Thankfully within a couple of minutes it was obvious that MaxStartups was being breached, and connections were being dropped early as a result,.\n\nIt turns out that OpenSSH 7.6 (the version that comes with Ubuntu 18.04) has better logging about MaxStartups; it only requires Verbose logging to get it. While not ideal, it's better than Debug level.\n\n### Lesson #3: It is polite to log interesting information at default levels and deliberately dropping a connection for any reason is definitely interesting to system administrators.\n\nSo now that we have a cause for the problem, how can we address it? We can bump MaxStartups, but what will that cost? Definitely a small bit of memory, but would it cause any untoward downstream effects? We could only speculate, so we had to just try it. We bumped the value to 150:30:300 (a 50% increase). This had a great positive effect, and no visible negative effect (such as increased CPU load):\n\n![Before and after graph](https://gitlab.com/gitlab-com/gl-infra/production/uploads/047a4859caafc6681c9d034c202418b9/image.png){: .shadow.medium.center}\n\nBefore and after bumping MaxStartups by 50%\n\n\nNote the substantial reduction after 01:15. We've clearly eliminated a large proportion of the errors, although a non-trivial amount remained. Interestingly, these are clustered around round numbers: the top of the hour, every 30 minutes, 15 minutes, and 10 minutes. Clearly the tyranny of the clock continues. The top of the hour saw the biggest peaks, which seems reasonable in hindsight; a lot of people will simply schedule their jobs to run every hour at 0 minutes past the hour. This finding was more evidence that confirms our theory that it was scheduled jobs causing the spikes, and that we were on the right path with this error being due to a numerical limit.\n\nDelightfully, there were no obvious negative effects. CPU usage on the SSH servers stayed about the same and didn't cause any noticeable increase in load. Even though we were unleashing more connections that would previously have been dropped, and doing so at the busiest times. This was promising.\n\n## Rate limiting\n\nAt this point we weren't keen on simply bumping MaxStartups higher; while our 50% increase to-date had worked, it felt pretty crude to keep on pushing this arbitrarily higher. Surely there was something else we could do.\n\nMy search took me to the HAProxy layer that we have in front of the SSH servers. HAProxy has a nice 'rate-limit sessions' option for its frontend listeners. When configured, it constrains the new TCP connections per-second that the frontend will pass through to backends, and leaves additional incoming connections on the TCP socket. If the incoming rate exceeds the limit (measured every millisecond) the new connections are simply delayed. The TCP client (SSH in this case) simply sees a delay before the TCP connection is established, which is delightfully graceful, in my opinion. As long as the overall rate never spiked too high above the limit for too long, we'd be fine.\n\nThe next question was what number we should use. This is complicated by the fact that we have 27 SSH backends, and 18 HAproxy frontends (16 main, two alt-ssh), and the frontends don't coordinate amongst themselves for this rate limiting. We also had to take into account how long it takes a new SSH session to make it past authentication: Assuming MaxStartups of 150, if the auth phase took two seconds we could only send 75 new sessions per second to the each backend. The [note on the issue](https://gitlab.com/gitlab-com/gl-infra/infrastructure/issues/7168#note_191678023) has the derivation of the math, and I won't recount it in detail here, except to note that there are four quantities needed to calculate the rate-limit: the counts of both server types, the value of MaxStartups, and `T`, which is how long the SSH session takes to auth. `T` is critical, but we could only estimate it. You might speculate how well I did at this estimate, but that would spoil the story. I went with two seconds for now, and came to a rate limit per frontend of approximately 112.5, and rounded down to 110.\n\nWe deployed. Everything was happy, yes? Errors tended to zero, and children danced happily in the streets? Well, not so much. This change had no visible effect on the error rates. I will be honest here, and say I was rather distressed. We had missed something important, or misunderstood the problem space entirely.\n\nSo we went back to logs (and eventually the HAProxy metrics), and were able to verify that the rate limiting was at least working to limit to the number we specified, and that historically this number had been higher, so we were successfully constraining the rate at which connections were being dispatched. But clearly the rate was still too high, and not only that, it wasn't even *close* enough to the right number to have a measurable impact. Looking at the selection of backends (as logged by HAproxy) showed an oddity: At the top of the hour, the backend connections were not evenly distributed across all the SSH servers. In the sample time chosen, it varied from 30 to 121 in a given second, meaning our load balancing wasn't very balanced. Reviewing the configuration showed we were using `balance source`, so that a given client IP address would always connect to the same backend. This might be good if you needed session stickiness, but this is SSH and we have no such need. It was deliberately chosen some time ago, but there was no record as to why. We couldn't come up with a good reason to keep it, so we tried changing to leastconn, which distributes new incoming connections to the backend with the least number of current connections. This was the result, of the CPU usage on our SSH (Git) fleet:\n\n![Leastconn before and after](https://gitlab.com/gitlab-com/gl-infra/infrastructure/uploads/b006877c1e45ad0255a316a96750402c/before-after-leastconn-change.png){: .shadow.medium.center}\n\nBefore and after turning on leastconn\n\n\nClearly leastconn was a good idea. The two low-usage lines are our [Canary](https://handbook.gitlab.com/handbook/engineering/infrastructure/library/canary/) servers and can be ignored, but the spread on the others before the change was 2:1 (30% to 60%), so clearly some of our backends were much busier than others due to the source IP hashing. This was surprising to me; it seemed reasonable to expect the range of client IPs to be sufficient to spread the load much more evenly, but apparently a few large outliers were enough to skew the usage significantly.\n\n### Lesson #4: When you choose specific non-default settings, leave a comment or link to documentation/issues as to why, future people will thank you.\n\n This transparency is [one of GitLab's core values](https://handbook.gitlab.com/handbook/values/#say-why-not-just-what).\n\nTurning on leastconn also helped reduce the error rates, so it is something we wanted to continue with. In the spirit of experimenting, we dropped the rate limit lower to 100, which further reduced the error rate, suggesting that perhaps the initial estimate for `T` was wrong. But if so, it was too small, leading to the rate limit being too high, and even 100/s felt pretty low and we weren't keen to drop it further. Unfortunately for some operational reasons these two changes were just an experiment, and we had to roll back to `balance source` and rate limit of 100.\n\nWith the rate limit as low as we were comfortable with, and leastconn insufficient, we tried increasing MaxStartups: first to 200 with some effect, then to 250. Lo, the errors all but disappeared, and nothing bad happened.\n\n### Lesson #5: As scary as it looks, MaxStartups appears to have very little performance impact even if it's raised much higher than the default.\n\nThis is probably a large and powerful lever we can pull in future, if necessary. It's possible we might notice problems if it gets into the thousands or tens of thousands, but we're a long way from that.\n\nWhat does this say about my estimate for `T`, the time to establish and authenticate an SSH session? Reverse engineering the equation, knowing that 200 wasn't quite enough for MaxStartups, and 250 is enough, we could calculate that `T` is probably between 2.7 and 3.4 seconds. So the estimate of two seconds wasn't far off, but the actual value was definitely higher than expected. We'll come back to this a bit later.\n\n## Final steps\n\nLooking at the logs again in hindsight, and after some contemplation, we discovered that we could identify this specific failure with t_state being `SD` and b_read (bytes read by client) of 0. As noted above, we handle approximately 26-28 million SSH connections per day. It was unpleasant to discover that at the worst of the problem, roughly 1.5% of those connections were being dropped badly. Clearly the problem was bigger than we had realised at the start. There was nothing about this that we couldn't have identified earlier (right back when we discovered that t_state=\"SD\" was indicative of the issue), but we didn't think to do so, and we should have. It might have increased how much effort we put in.\n\n### Lesson #6: Measure the actual rate of your errors as early as possible.\n\nWe might have put a higher priority on this earlier had we realized the extent of the problem, although it was still dependent on knowing the identifying characteristic.\n\nOn the plus side, after our bumps to MaxStartups and rate limiting, the error rate was down to 0.001%, or a few thousand per day. This was better, but still higher than we liked. After we unblocked some other operational matters, we were able to formally deploy the leastconn change, and the errors were eliminated entirely. We could breathe easy again.\n\n## Further work\n\nClearly the SSH authentication phase is still taking quite a while, perhaps up to 3.4 seconds. GitLab can use [AuthorizedKeysCommand](https://docs.gitlab.com/ee/administration/operations/fast_ssh_key_lookup.html) to look up the SSH key directly in the database. This is critical for speedy operations when you have a large number of users, otherwise SSHD has to sequentially read a very large `authorized_keys` file to look up the public key of the user, and this doesn't scale well. We implement the lookup with a little bit of ruby that calls an internal HTTP API. [Stan Hu](/company/team/#stanhu), engineering fellow and our resident source of GitLab knowledge, identified that the unicorn instances on the Git/SSH servers were experiencing substantial queuing. This could be a significant contributor to the ~3-second pre-authentication stage, and therefore something we need to look at further, so investigations continue. We may increase the number of unicorn (or puma) workers on these nodes, so there's always a worker available for SSH. However, that isn't without risk, so we will need to be careful and measure well. Work continues, but slower now that the core user problem has been mitigated. We may eventually be able to reduce MaxStartups, although given the lack of negative impact it seems to have, there's little need. It would make everyone more comfortable if OpenSSH let us see the how close we were to hitting MaxStartups at any point, rather than having to go in blind and only find out we were close when the limit is breached and connections are dropped.\n\nWe also need to alert when we see HAProxy logs that indicate the problem is occurring, because in practice there's no reason it should ever happen. If it does, we need to increase MaxStartups further, or if resources are constrained, add more Git/SSH nodes.\n\n## Conclusion\n\nComplex systems have complex interactions, and there is often more than one lever that can be used to control various bottlenecks. It's good to know what tools are available because they often have trade-offs. Assumptions and estimates can also be risky. In hindsight, I would have attempted to get a much better measurement of how long authentication takes, so that my `T` estimate was better.\n\nBut the biggest lesson is that when large numbers of people schedule jobs at round numbers on the clock, it leads to really interesting scaling problems for centralized service providers like GitLab. If you're one of them, you might like to consider putting in a random sleep of maybe 30 seconds at the start, or pick a random time during the hour *and* put in the random sleep, just to be polite and fight the tyranny of the clock.\n\nCover image by [Jon Tyson](https://unsplash.com/@jontyson) on [Unsplash](https://unsplash.com)\n",[23,24,25],"git","performance","production","yml",{},true,"/en-us/blog/tyranny-of-the-clock",{"title":15,"description":16,"ogTitle":15,"ogDescription":16,"noIndex":12,"ogImage":19,"ogUrl":31,"ogSiteName":32,"ogType":33,"canonicalUrls":31},"https://about.gitlab.com/blog/tyranny-of-the-clock","https://about.gitlab.com","article","en-us/blog/tyranny-of-the-clock",[23,24,25],"ZgEv3qw3ttdl6tfqABKwBg4ZorgPwuJcc9r3bJwS9GU",{"data":38},{"logo":39,"freeTrial":44,"sales":49,"login":54,"items":59,"search":367,"minimal":398,"duo":417,"pricingDeployment":427},{"config":40},{"href":41,"dataGaName":42,"dataGaLocation":43},"/","gitlab logo","header",{"text":45,"config":46},"Get free trial",{"href":47,"dataGaName":48,"dataGaLocation":43},"https://gitlab.com/-/trial_registrations/new?glm_source=about.gitlab.com&glm_content=default-saas-trial/","free trial",{"text":50,"config":51},"Talk to sales",{"href":52,"dataGaName":53,"dataGaLocation":43},"/sales/","sales",{"text":55,"config":56},"Sign in",{"href":57,"dataGaName":58,"dataGaLocation":43},"https://gitlab.com/users/sign_in/","sign in",[60,87,182,187,288,348],{"text":61,"config":62,"cards":64},"Platform",{"dataNavLevelOne":63},"platform",[65,71,79],{"title":61,"description":66,"link":67},"The intelligent orchestration platform for DevSecOps",{"text":68,"config":69},"Explore our Platform",{"href":70,"dataGaName":63,"dataGaLocation":43},"/platform/",{"title":72,"description":73,"link":74},"GitLab Duo Agent Platform","Agentic AI for the entire software lifecycle",{"text":75,"config":76},"Meet GitLab Duo",{"href":77,"dataGaName":78,"dataGaLocation":43},"/gitlab-duo-agent-platform/","gitlab duo agent platform",{"title":80,"description":81,"link":82},"Why GitLab","See the top reasons enterprises choose GitLab",{"text":83,"config":84},"Learn more",{"href":85,"dataGaName":86,"dataGaLocation":43},"/why-gitlab/","why gitlab",{"text":88,"left":28,"config":89,"link":91,"lists":95,"footer":164},"Product",{"dataNavLevelOne":90},"solutions",{"text":92,"config":93},"View all Solutions",{"href":94,"dataGaName":90,"dataGaLocation":43},"/solutions/",[96,120,143],{"title":97,"description":98,"link":99,"items":104},"Automation","CI/CD and automation to accelerate deployment",{"config":100},{"icon":101,"href":102,"dataGaName":103,"dataGaLocation":43},"AutomatedCodeAlt","/solutions/delivery-automation/","automated software delivery",[105,109,112,116],{"text":106,"config":107},"CI/CD",{"href":108,"dataGaLocation":43,"dataGaName":106},"/solutions/continuous-integration/",{"text":72,"config":110},{"href":77,"dataGaLocation":43,"dataGaName":111},"gitlab duo agent platform - product menu",{"text":113,"config":114},"Source Code Management",{"href":115,"dataGaLocation":43,"dataGaName":113},"/solutions/source-code-management/",{"text":117,"config":118},"Automated Software Delivery",{"href":102,"dataGaLocation":43,"dataGaName":119},"Automated software delivery",{"title":121,"description":122,"link":123,"items":128},"Security","Deliver code faster without compromising security",{"config":124},{"href":125,"dataGaName":126,"dataGaLocation":43,"icon":127},"/solutions/application-security-testing/","security and compliance","ShieldCheckLight",[129,133,138],{"text":130,"config":131},"Application Security Testing",{"href":125,"dataGaName":132,"dataGaLocation":43},"Application security testing",{"text":134,"config":135},"Software Supply Chain Security",{"href":136,"dataGaLocation":43,"dataGaName":137},"/solutions/supply-chain/","Software supply chain security",{"text":139,"config":140},"Software Compliance",{"href":141,"dataGaName":142,"dataGaLocation":43},"/solutions/software-compliance/","software compliance",{"title":144,"link":145,"items":150},"Measurement",{"config":146},{"icon":147,"href":148,"dataGaName":149,"dataGaLocation":43},"DigitalTransformation","/solutions/visibility-measurement/","visibility and measurement",[151,155,159],{"text":152,"config":153},"Visibility & Measurement",{"href":148,"dataGaLocation":43,"dataGaName":154},"Visibility and Measurement",{"text":156,"config":157},"Value Stream Management",{"href":158,"dataGaLocation":43,"dataGaName":156},"/solutions/value-stream-management/",{"text":160,"config":161},"Analytics & Insights",{"href":162,"dataGaLocation":43,"dataGaName":163},"/solutions/analytics-and-insights/","Analytics and insights",{"title":165,"items":166},"GitLab for",[167,172,177],{"text":168,"config":169},"Enterprise",{"href":170,"dataGaLocation":43,"dataGaName":171},"/enterprise/","enterprise",{"text":173,"config":174},"Small Business",{"href":175,"dataGaLocation":43,"dataGaName":176},"/small-business/","small business",{"text":178,"config":179},"Public Sector",{"href":180,"dataGaLocation":43,"dataGaName":181},"/solutions/public-sector/","public sector",{"text":183,"config":184},"Pricing",{"href":185,"dataGaName":186,"dataGaLocation":43,"dataNavLevelOne":186},"/pricing/","pricing",{"text":188,"config":189,"link":191,"lists":195,"feature":275},"Resources",{"dataNavLevelOne":190},"resources",{"text":192,"config":193},"View all resources",{"href":194,"dataGaName":190,"dataGaLocation":43},"/resources/",[196,229,247],{"title":197,"items":198},"Getting started",[199,204,209,214,219,224],{"text":200,"config":201},"Install",{"href":202,"dataGaName":203,"dataGaLocation":43},"/install/","install",{"text":205,"config":206},"Quick start guides",{"href":207,"dataGaName":208,"dataGaLocation":43},"/get-started/","quick setup checklists",{"text":210,"config":211},"Learn",{"href":212,"dataGaLocation":43,"dataGaName":213},"https://university.gitlab.com/","learn",{"text":215,"config":216},"Product documentation",{"href":217,"dataGaName":218,"dataGaLocation":43},"https://docs.gitlab.com/","product documentation",{"text":220,"config":221},"Best practice videos",{"href":222,"dataGaName":223,"dataGaLocation":43},"/getting-started-videos/","best practice videos",{"text":225,"config":226},"Integrations",{"href":227,"dataGaName":228,"dataGaLocation":43},"/integrations/","integrations",{"title":230,"items":231},"Discover",[232,237,242],{"text":233,"config":234},"Customer success stories",{"href":235,"dataGaName":236,"dataGaLocation":43},"/customers/","customer success stories",{"text":238,"config":239},"Blog",{"href":240,"dataGaName":241,"dataGaLocation":43},"/blog/","blog",{"text":243,"config":244},"Remote",{"href":245,"dataGaName":246,"dataGaLocation":43},"https://handbook.gitlab.com/handbook/company/culture/all-remote/","remote",{"title":248,"items":249},"Connect",[250,255,260,265,270],{"text":251,"config":252},"GitLab Services",{"href":253,"dataGaName":254,"dataGaLocation":43},"/services/","services",{"text":256,"config":257},"Community",{"href":258,"dataGaName":259,"dataGaLocation":43},"/community/","community",{"text":261,"config":262},"Forum",{"href":263,"dataGaName":264,"dataGaLocation":43},"https://forum.gitlab.com/","forum",{"text":266,"config":267},"Events",{"href":268,"dataGaName":269,"dataGaLocation":43},"/events/","events",{"text":271,"config":272},"Partners",{"href":273,"dataGaName":274,"dataGaLocation":43},"/partners/","partners",{"backgroundColor":276,"textColor":277,"text":278,"image":279,"link":283},"#2f2a6b","#fff","Insights for the future of software development",{"altText":280,"config":281},"the source promo card",{"src":282},"https://res.cloudinary.com/about-gitlab-com/image/upload/v1758208064/dzl0dbift9xdizyelkk4.svg",{"text":284,"config":285},"Read the latest",{"href":286,"dataGaName":287,"dataGaLocation":43},"/the-source/","the source",{"text":289,"config":290,"lists":292},"Company",{"dataNavLevelOne":291},"company",[293],{"items":294},[295,300,306,308,313,318,323,328,333,338,343],{"text":296,"config":297},"About",{"href":298,"dataGaName":299,"dataGaLocation":43},"/company/","about",{"text":301,"config":302,"footerGa":305},"Jobs",{"href":303,"dataGaName":304,"dataGaLocation":43},"/jobs/","jobs",{"dataGaName":304},{"text":266,"config":307},{"href":268,"dataGaName":269,"dataGaLocation":43},{"text":309,"config":310},"Leadership",{"href":311,"dataGaName":312,"dataGaLocation":43},"/company/team/e-group/","leadership",{"text":314,"config":315},"Team",{"href":316,"dataGaName":317,"dataGaLocation":43},"/company/team/","team",{"text":319,"config":320},"Handbook",{"href":321,"dataGaName":322,"dataGaLocation":43},"https://handbook.gitlab.com/","handbook",{"text":324,"config":325},"Investor relations",{"href":326,"dataGaName":327,"dataGaLocation":43},"https://ir.gitlab.com/","investor relations",{"text":329,"config":330},"Trust Center",{"href":331,"dataGaName":332,"dataGaLocation":43},"/security/","trust center",{"text":334,"config":335},"AI Transparency Center",{"href":336,"dataGaName":337,"dataGaLocation":43},"/ai-transparency-center/","ai transparency center",{"text":339,"config":340},"Newsletter",{"href":341,"dataGaName":342,"dataGaLocation":43},"/company/contact/#contact-forms","newsletter",{"text":344,"config":345},"Press",{"href":346,"dataGaName":347,"dataGaLocation":43},"/press/","press",{"text":349,"config":350,"lists":351},"Contact us",{"dataNavLevelOne":291},[352],{"items":353},[354,357,362],{"text":50,"config":355},{"href":52,"dataGaName":356,"dataGaLocation":43},"talk to sales",{"text":358,"config":359},"Support portal",{"href":360,"dataGaName":361,"dataGaLocation":43},"https://support.gitlab.com","support portal",{"text":363,"config":364},"Customer portal",{"href":365,"dataGaName":366,"dataGaLocation":43},"https://customers.gitlab.com/customers/sign_in/","customer portal",{"close":368,"login":369,"suggestions":376},"Close",{"text":370,"link":371},"To search repositories and projects, login to",{"text":372,"config":373},"gitlab.com",{"href":57,"dataGaName":374,"dataGaLocation":375},"search login","search",{"text":377,"default":378},"Suggestions",[379,381,385,387,391,395],{"text":72,"config":380},{"href":77,"dataGaName":72,"dataGaLocation":375},{"text":382,"config":383},"Code Suggestions (AI)",{"href":384,"dataGaName":382,"dataGaLocation":375},"/solutions/code-suggestions/",{"text":106,"config":386},{"href":108,"dataGaName":106,"dataGaLocation":375},{"text":388,"config":389},"GitLab on AWS",{"href":390,"dataGaName":388,"dataGaLocation":375},"/partners/technology-partners/aws/",{"text":392,"config":393},"GitLab on Google Cloud",{"href":394,"dataGaName":392,"dataGaLocation":375},"/partners/technology-partners/google-cloud-platform/",{"text":396,"config":397},"Why GitLab?",{"href":85,"dataGaName":396,"dataGaLocation":375},{"freeTrial":399,"mobileIcon":404,"desktopIcon":409,"secondaryButton":412},{"text":400,"config":401},"Start free trial",{"href":402,"dataGaName":48,"dataGaLocation":403},"https://gitlab.com/-/trials/new/","nav",{"altText":405,"config":406},"Gitlab Icon",{"src":407,"dataGaName":408,"dataGaLocation":403},"https://res.cloudinary.com/about-gitlab-com/image/upload/v1758203874/jypbw1jx72aexsoohd7x.svg","gitlab icon",{"altText":405,"config":410},{"src":411,"dataGaName":408,"dataGaLocation":403},"https://res.cloudinary.com/about-gitlab-com/image/upload/v1758203875/gs4c8p8opsgvflgkswz9.svg",{"text":413,"config":414},"Get Started",{"href":415,"dataGaName":416,"dataGaLocation":403},"https://gitlab.com/-/trial_registrations/new?glm_source=about.gitlab.com/get-started/","get started",{"freeTrial":418,"mobileIcon":423,"desktopIcon":425},{"text":419,"config":420},"Learn more about GitLab Duo",{"href":421,"dataGaName":422,"dataGaLocation":403},"/gitlab-duo/","gitlab duo",{"altText":405,"config":424},{"src":407,"dataGaName":408,"dataGaLocation":403},{"altText":405,"config":426},{"src":411,"dataGaName":408,"dataGaLocation":403},{"freeTrial":428,"mobileIcon":433,"desktopIcon":435},{"text":429,"config":430},"Back to pricing",{"href":185,"dataGaName":431,"dataGaLocation":403,"icon":432},"back to pricing","GoBack",{"altText":405,"config":434},{"src":407,"dataGaName":408,"dataGaLocation":403},{"altText":405,"config":436},{"src":411,"dataGaName":408,"dataGaLocation":403},{"title":438,"button":439,"config":444},"See how agentic AI transforms software delivery",{"text":440,"config":441},"Watch GitLab Transcend now",{"href":442,"dataGaName":443,"dataGaLocation":43},"/events/transcend/virtual/","transcend event",{"layout":445,"icon":446},"release","AiStar",{"data":448},{"text":449,"source":450,"edit":456,"contribute":461,"config":466,"items":471,"minimal":678},"Git is a trademark of Software Freedom Conservancy and our use of 'GitLab' is under license",{"text":451,"config":452},"View page source",{"href":453,"dataGaName":454,"dataGaLocation":455},"https://gitlab.com/gitlab-com/marketing/digital-experience/about-gitlab-com/","page source","footer",{"text":457,"config":458},"Edit this page",{"href":459,"dataGaName":460,"dataGaLocation":455},"https://gitlab.com/gitlab-com/marketing/digital-experience/about-gitlab-com/-/blob/main/content/","web ide",{"text":462,"config":463},"Please contribute",{"href":464,"dataGaName":465,"dataGaLocation":455},"https://gitlab.com/gitlab-com/marketing/digital-experience/about-gitlab-com/-/blob/main/CONTRIBUTING.md/","please contribute",{"twitter":467,"facebook":468,"youtube":469,"linkedin":470},"https://twitter.com/gitlab","https://www.facebook.com/gitlab","https://www.youtube.com/channel/UCnMGQ8QHMAnVIsI3xJrihhg","https://www.linkedin.com/company/gitlab-com",[472,519,573,617,644],{"title":183,"links":473,"subMenu":488},[474,478,483],{"text":475,"config":476},"View plans",{"href":185,"dataGaName":477,"dataGaLocation":455},"view plans",{"text":479,"config":480},"Why Premium?",{"href":481,"dataGaName":482,"dataGaLocation":455},"/pricing/premium/","why premium",{"text":484,"config":485},"Why Ultimate?",{"href":486,"dataGaName":487,"dataGaLocation":455},"/pricing/ultimate/","why ultimate",[489],{"title":490,"links":491},"Contact Us",[492,495,497,499,504,509,514],{"text":493,"config":494},"Contact sales",{"href":52,"dataGaName":53,"dataGaLocation":455},{"text":358,"config":496},{"href":360,"dataGaName":361,"dataGaLocation":455},{"text":363,"config":498},{"href":365,"dataGaName":366,"dataGaLocation":455},{"text":500,"config":501},"Status",{"href":502,"dataGaName":503,"dataGaLocation":455},"https://status.gitlab.com/","status",{"text":505,"config":506},"Terms of use",{"href":507,"dataGaName":508,"dataGaLocation":455},"/terms/","terms of use",{"text":510,"config":511},"Privacy statement",{"href":512,"dataGaName":513,"dataGaLocation":455},"/privacy/","privacy statement",{"text":515,"config":516},"Cookie preferences",{"dataGaName":517,"dataGaLocation":455,"id":518,"isOneTrustButton":28},"cookie preferences","ot-sdk-btn",{"title":88,"links":520,"subMenu":529},[521,525],{"text":522,"config":523},"DevSecOps platform",{"href":70,"dataGaName":524,"dataGaLocation":455},"devsecops platform",{"text":526,"config":527},"AI-Assisted Development",{"href":421,"dataGaName":528,"dataGaLocation":455},"ai-assisted development",[530],{"title":531,"links":532},"Topics",[533,538,543,548,553,558,563,568],{"text":534,"config":535},"CICD",{"href":536,"dataGaName":537,"dataGaLocation":455},"/topics/ci-cd/","cicd",{"text":539,"config":540},"GitOps",{"href":541,"dataGaName":542,"dataGaLocation":455},"/topics/gitops/","gitops",{"text":544,"config":545},"DevOps",{"href":546,"dataGaName":547,"dataGaLocation":455},"/topics/devops/","devops",{"text":549,"config":550},"Version Control",{"href":551,"dataGaName":552,"dataGaLocation":455},"/topics/version-control/","version control",{"text":554,"config":555},"DevSecOps",{"href":556,"dataGaName":557,"dataGaLocation":455},"/topics/devsecops/","devsecops",{"text":559,"config":560},"Cloud Native",{"href":561,"dataGaName":562,"dataGaLocation":455},"/topics/cloud-native/","cloud native",{"text":564,"config":565},"AI for Coding",{"href":566,"dataGaName":567,"dataGaLocation":455},"/topics/devops/ai-for-coding/","ai for coding",{"text":569,"config":570},"Agentic AI",{"href":571,"dataGaName":572,"dataGaLocation":455},"/topics/agentic-ai/","agentic ai",{"title":574,"links":575},"Solutions",[576,578,580,585,589,592,596,599,601,604,607,612],{"text":130,"config":577},{"href":125,"dataGaName":130,"dataGaLocation":455},{"text":119,"config":579},{"href":102,"dataGaName":103,"dataGaLocation":455},{"text":581,"config":582},"Agile development",{"href":583,"dataGaName":584,"dataGaLocation":455},"/solutions/agile-delivery/","agile delivery",{"text":586,"config":587},"SCM",{"href":115,"dataGaName":588,"dataGaLocation":455},"source code management",{"text":534,"config":590},{"href":108,"dataGaName":591,"dataGaLocation":455},"continuous integration & delivery",{"text":593,"config":594},"Value stream management",{"href":158,"dataGaName":595,"dataGaLocation":455},"value stream management",{"text":539,"config":597},{"href":598,"dataGaName":542,"dataGaLocation":455},"/solutions/gitops/",{"text":168,"config":600},{"href":170,"dataGaName":171,"dataGaLocation":455},{"text":602,"config":603},"Small business",{"href":175,"dataGaName":176,"dataGaLocation":455},{"text":605,"config":606},"Public sector",{"href":180,"dataGaName":181,"dataGaLocation":455},{"text":608,"config":609},"Education",{"href":610,"dataGaName":611,"dataGaLocation":455},"/solutions/education/","education",{"text":613,"config":614},"Financial services",{"href":615,"dataGaName":616,"dataGaLocation":455},"/solutions/finance/","financial services",{"title":188,"links":618},[619,621,623,625,628,630,632,634,636,638,640,642],{"text":200,"config":620},{"href":202,"dataGaName":203,"dataGaLocation":455},{"text":205,"config":622},{"href":207,"dataGaName":208,"dataGaLocation":455},{"text":210,"config":624},{"href":212,"dataGaName":213,"dataGaLocation":455},{"text":215,"config":626},{"href":217,"dataGaName":627,"dataGaLocation":455},"docs",{"text":238,"config":629},{"href":240,"dataGaName":241,"dataGaLocation":455},{"text":233,"config":631},{"href":235,"dataGaName":236,"dataGaLocation":455},{"text":243,"config":633},{"href":245,"dataGaName":246,"dataGaLocation":455},{"text":251,"config":635},{"href":253,"dataGaName":254,"dataGaLocation":455},{"text":256,"config":637},{"href":258,"dataGaName":259,"dataGaLocation":455},{"text":261,"config":639},{"href":263,"dataGaName":264,"dataGaLocation":455},{"text":266,"config":641},{"href":268,"dataGaName":269,"dataGaLocation":455},{"text":271,"config":643},{"href":273,"dataGaName":274,"dataGaLocation":455},{"title":289,"links":645},[646,648,650,652,654,656,658,662,667,669,671,673],{"text":296,"config":647},{"href":298,"dataGaName":291,"dataGaLocation":455},{"text":301,"config":649},{"href":303,"dataGaName":304,"dataGaLocation":455},{"text":309,"config":651},{"href":311,"dataGaName":312,"dataGaLocation":455},{"text":314,"config":653},{"href":316,"dataGaName":317,"dataGaLocation":455},{"text":319,"config":655},{"href":321,"dataGaName":322,"dataGaLocation":455},{"text":324,"config":657},{"href":326,"dataGaName":327,"dataGaLocation":455},{"text":659,"config":660},"Sustainability",{"href":661,"dataGaName":659,"dataGaLocation":455},"/sustainability/",{"text":663,"config":664},"Diversity, inclusion and belonging (DIB)",{"href":665,"dataGaName":666,"dataGaLocation":455},"/diversity-inclusion-belonging/","Diversity, inclusion and belonging",{"text":329,"config":668},{"href":331,"dataGaName":332,"dataGaLocation":455},{"text":339,"config":670},{"href":341,"dataGaName":342,"dataGaLocation":455},{"text":344,"config":672},{"href":346,"dataGaName":347,"dataGaLocation":455},{"text":674,"config":675},"Modern Slavery Transparency Statement",{"href":676,"dataGaName":677,"dataGaLocation":455},"https://handbook.gitlab.com/handbook/legal/modern-slavery-act-transparency-statement/","modern slavery transparency statement",{"items":679},[680,683,686],{"text":681,"config":682},"Terms",{"href":507,"dataGaName":508,"dataGaLocation":455},{"text":684,"config":685},"Cookies",{"dataGaName":517,"dataGaLocation":455,"id":518,"isOneTrustButton":28},{"text":687,"config":688},"Privacy",{"href":512,"dataGaName":513,"dataGaLocation":455},[690],{"id":691,"title":18,"body":8,"config":692,"content":694,"description":8,"extension":26,"meta":698,"navigation":28,"path":699,"seo":700,"stem":701,"__hash__":702},"blogAuthors/en-us/blog/authors/craig-miskell.yml",{"template":693},"BlogAuthor",{"name":18,"config":695},{"headshot":696,"ctfId":697},"https://res.cloudinary.com/about-gitlab-com/image/upload/v1749667372/Blog/Author%20Headshots/cmiskell-headshot.jpg","cmiskell",{},"/en-us/blog/authors/craig-miskell",{},"en-us/blog/authors/craig-miskell","VPOHmA7ISNMfEE3Y0SkxSJD1DRgIqQVOLz9bf8UCnh8",[704,719,732],{"content":705,"config":717},{"title":706,"description":707,"authors":708,"heroImage":710,"date":711,"body":712,"category":9,"tags":713},"How to use GitLab Container Virtual Registry with Docker Hardened Images","Learn how to simplify container image management with this step-by-step guide.",[709],"Tim Rizzi","https://res.cloudinary.com/about-gitlab-com/image/upload/v1772111172/mwhgbjawn62kymfwrhle.png","2026-03-12","If you're a platform engineer, you've probably had this conversation:\n  \n*\"Security says we need to use hardened base images.\"*\n\n*\"Great, where do I configure credentials for yet another registry?\"*\n\n*\"Also, how do we make sure everyone actually uses them?\"*\n\nOr this one:\n\n*\"Why are our builds so slow?\"*\n\n*\"We're pulling the same 500MB image from Docker Hub in every single job.\"*\n\n*\"Can't we just cache these somewhere?\"*\n\nI've been working on [Container Virtual Registry](https://docs.gitlab.com/user/packages/virtual_registry/container/) at GitLab specifically to solve these problems. It's a pull-through cache that sits in front of your upstream registries — Docker Hub, dhi.io (Docker Hardened Images), MCR, and Quay — and gives your teams a single endpoint to pull from. Images get cached on the first pull. Subsequent pulls come from the cache. Your developers don't need to know or care which upstream a particular image came from.\n\nThis article shows you how to set up Container Virtual Registry, specifically with Docker Hardened Images in mind, since that's a combination that makes a lot of sense for teams concerned about security and not making their developers' lives harder.\n\n## What problem are we actually solving?\n\nThe Platform teams I usually talk to manage container images across three to five registries:\n\n* **Docker Hub** for most base images\n* **dhi.io** for Docker Hardened Images (security-conscious workloads)\n* **MCR** for .NET and Azure tooling\n* **Quay.io** for Red Hat ecosystem stuff\n* **Internal registries** for proprietary images\n\nEach one has its own:\n\n* Authentication mechanism\n* Network latency characteristics\n* Way of organizing image paths\n\nYour CI/CD configs end up littered with registry-specific logic. Credential management becomes a project unto itself. And every pipeline job pulls the same base images over the network, even though they haven't changed in weeks.\n\nContainer Virtual Registry consolidates this. One registry URL. One authentication flow (GitLab's). Cached images are served from GitLab's infrastructure rather than traversing the internet each time.\n\n## How it works\n\nThe model is straightforward:\n\n```text\nYour pipeline pulls:\n  gitlab.com/virtual_registries/container/1000016/python:3.13\n\nVirtual registry checks:\n  1. Do I have this cached? → Return it\n  2. No? → Fetch from upstream, cache it, return it\n\n```\n\nYou configure upstreams in priority order. When a pull request comes in, the virtual registry checks each upstream until it finds the image. The result gets cached for a configurable period (default 24 hours).\n\n```text\n┌─────────────────────────────────────────────────────────┐\n│                    CI/CD Pipeline                       │\n│                          │                              │\n│                          ▼                              │\n│   gitlab.com/virtual_registries/container/\u003Cid>/image   │\n└─────────────────────────────────────────────────────────┘\n                           │\n                           ▼\n┌─────────────────────────────────────────────────────────┐\n│            Container Virtual Registry                   │\n│                                                         │\n│  Upstream 1: Docker Hub ────────────────┐               │\n│  Upstream 2: dhi.io (Hardened) ────────┐│               │\n│  Upstream 3: MCR ─────────────────────┐││               │\n│  Upstream 4: Quay.io ────────────────┐│││               │\n│                                      ││││               │\n│                    ┌─────────────────┴┴┴┴──┐            │\n│                    │        Cache          │            │\n│                    │  (manifests + layers) │            │\n│                    └───────────────────────┘            │\n└─────────────────────────────────────────────────────────┘\n```\n\n## Why this matters for Docker Hardened Images\n\n[Docker Hardened Images](https://docs.docker.com/dhi/) are great because of the minimal attack surface, near-zero CVEs, proper software bills of materials (SBOMs), and SLSA provenance. If you're evaluating base images for security-sensitive workloads, they should be on your list.\n\nBut adopting them creates the same operational friction as any new registry:\n\n* **Credential distribution**: You need to get Docker credentials to every system that pulls images from dhi.io.\n* **CI/CD changes**: Every pipeline needs to be updated to authenticate with dhi.io.\n* **Developer friction**: People need to remember to use the hardened variants.\n* **Visibility gap**: It's difficulat to tell if teams are actually using hardened images vs. regular ones.\n\nVirtual registry addresses each of these:\n\n**Single credential**: Teams authenticate to GitLab. The virtual registry handles upstream authentication. You configure Docker credentials once, at the registry level, and they apply to all pulls.\n\n**No CI/CD changes per-team**: Point pipelines at your virtual registry. Done. The upstream configuration is centralized.\n\n**Gradual adoption**: Since images get cached with their full path, you can see in the cache what's being pulled. If someone's pulling `library/python:3.11` instead of the hardened variant, you'll know.\n\n**Audit trail**: The cache shows you exactly which images are in active use. Useful for compliance, useful for understanding what your fleet actually depends on.\n\n## Setting it up\n\nHere's a real setup using the Python client from this demo project.\n\n### Create the virtual registry\n\n```python\nfrom virtual_registry_client import VirtualRegistryClient\n\nclient = VirtualRegistryClient()\n\nregistry = client.create_virtual_registry(\n    group_id=\"785414\",  # Your top-level group ID\n    name=\"platform-images\",\n    description=\"Cached container images for platform teams\"\n)\n\nprint(f\"Registry ID: {registry['id']}\")\n# You'll need this ID for the pull URL\n```\n\n### Add Docker Hub as an upstream\n\nFor official images like Alpine, Python, etc.:\n\n```python\ndocker_upstream = client.create_upstream(\n    registry_id=registry['id'],\n    url=\"https://registry-1.docker.io\",\n    name=\"Docker Hub\",\n    cache_validity_hours=24\n)\n```\n\n### Add Docker Hardened Images (dhi.io)\n\nDocker Hardened Images are hosted on `dhi.io`, a separate registry that requires authentication:\n\n```python\ndhi_upstream = client.create_upstream(\n    registry_id=registry['id'],\n    url=\"https://dhi.io\",\n    name=\"Docker Hardened Images\",\n    username=\"your-docker-username\",\n    password=\"your-docker-access-token\",\n    cache_validity_hours=24\n)\n```\n\n### Add other upstreams\n\n```python\n# MCR for .NET teams\nclient.create_upstream(\n    registry_id=registry['id'],\n    url=\"https://mcr.microsoft.com\",\n    name=\"Microsoft Container Registry\",\n    cache_validity_hours=48\n)\n\n# Quay for Red Hat stuff\nclient.create_upstream(\n    registry_id=registry['id'],\n    url=\"https://quay.io\",\n    name=\"Quay.io\",\n    cache_validity_hours=24\n)\n```\n\n### Update your CI/CD\n\nHere's a `.gitlab-ci.yml` that pulls through the virtual registry:\n\n```yaml\nvariables:\n  VIRTUAL_REGISTRY_ID: \u003Cyour_virtual_registry_ID>\n\n  \nbuild:\n  image: docker:24\n  services:\n    - docker:24-dind\n  before_script:\n    # Authenticate to GitLab (which handles upstream auth for you)\n    - echo \"${CI_JOB_TOKEN}\" | docker login -u gitlab-ci-token --password-stdin gitlab.com\n  script:\n    # All of these go through your single virtual registry\n    \n    # Official Docker Hub images (use library/ prefix)\n    - docker pull gitlab.com/virtual_registries/container/${VIRTUAL_REGISTRY_ID}/library/alpine:latest\n    \n    # Docker Hardened Images from dhi.io (no prefix needed)\n    - docker pull gitlab.com/virtual_registries/container/${VIRTUAL_REGISTRY_ID}/python:3.13\n    \n    # .NET from MCR\n    - docker pull gitlab.com/virtual_registries/container/${VIRTUAL_REGISTRY_ID}/dotnet/sdk:8.0\n```\n\n### Image path formats\n\nDifferent registries use different path conventions:\n\n| Registry | Pull URL Example |\n|----------|------------------|\n| Docker Hub (official) | `.../library/python:3.11-slim` |\n| Docker Hardened Images (dhi.io) | `.../python:3.13` |\n| MCR | `.../dotnet/sdk:8.0` |\n| Quay.io | `.../prometheus/prometheus:latest` |\n\n### Verify it's working\n\nAfter some pulls, check your cache:\n\n```python\nupstreams = client.list_registry_upstreams(registry['id'])\nfor upstream in upstreams:\n    entries = client.list_cache_entries(upstream['id'])\n    print(f\"{upstream['name']}: {len(entries)} cached entries\")\n\n```\n\n## What the numbers look like\n\nI ran tests pulling images through the virtual registry:\n\n| Metric | Without Cache | With Warm Cache |\n|--------|---------------|-----------------|\n| Pull time (Alpine) | 10.3s | 4.2s |\n| Pull time (Python 3.13 DHI) | 11.6s | ~4s |\n| Network roundtrips to upstream | Every pull | Cache misses only |\n\n\n\n\nThe first pull is the same speed (it has to fetch from upstream). Every pull after that, for the cache validity period, comes straight from GitLab's storage. No network hop to Docker Hub, dhi.io, MCR, or wherever the image lives.\n\nFor a team running hundreds of pipeline jobs per day, that's hours of cumulative build time saved.\n\n## Practical considerations\nHere are some considerations to keep in mind:\n\n### Cache validity\n\n24 hours is the default. For security-sensitive images where you want patches quickly, consider 12 hours or less:\n\n```python\nclient.create_upstream(\n    registry_id=registry['id'],\n    url=\"https://dhi.io\",\n    name=\"Docker Hardened Images\",\n    username=\"your-username\",\n    password=\"your-token\",\n    cache_validity_hours=12\n)\n```\n\nFor stable, infrequently-updated images (like specific version tags), longer validity is fine.\n\n### Upstream priority\n\nUpstreams are checked in order. If you have images with the same name on different registries, the first matching upstream wins.\n\n### Limits\n\n* Maximum of 20 virtual registries per group\n* Maximum of 20 upstreams per virtual registry\n\n## Configuration via UI\n\nYou can also configure virtual registries and upstreams directly from the GitLab UI—no API calls required. Navigate to your group's **Settings > Packages and registries > Virtual Registry** to:\n\n* Create and manage virtual registries\n* Add, edit, and reorder upstream registries\n* View and manage the cache\n* Monitor which images are being pulled\n\n## What's next\n\nWe're actively developing:\n\n* **Allow/deny lists**: Use regex to control which images can be pulled from specific upstreams.\n\nThis is beta software. It works, people are using it in production, but we're still iterating based on feedback.\n\n## Share your feedback\n\nIf you're a platform engineer dealing with container registry sprawl, I'd like to understand your setup:\n\n* How many upstream registries are you managing?\n* What's your biggest pain point with the current state?\n* Would something like this help, and if not, what's missing?\n\nPlease share your experiences in the [Container Virtual Registry feedback issue](https://gitlab.com/gitlab-org/gitlab/-/work_items/589630).\n## Related resources\n- [New GitLab metrics and registry features help reduce CI/CD bottlenecks](https://about.gitlab.com/blog/new-gitlab-metrics-and-registry-features-help-reduce-ci-cd-bottlenecks/#container-virtual-registry)\n- [Container Virtual Registry documentation](https://docs.gitlab.com/user/packages/virtual_registry/container/)\n- [Container Virtual Registry API](https://docs.gitlab.com/api/container_virtual_registries/)",[714,715,716],"tutorial","product","features",{"featured":12,"template":13,"slug":718},"using-gitlab-container-virtual-registry-with-docker-hardened-images",{"content":720,"config":730},{"title":721,"description":722,"authors":723,"heroImage":725,"date":726,"category":9,"tags":727,"body":729},"How IIT Bombay students are coding the future with GitLab","At GitLab, we often talk about how software accelerates innovation. But sometimes, you have to step away from the Zoom calls and stand in a crowded university hall to remember why we do this.",[724],"Nick Veenhof","https://res.cloudinary.com/about-gitlab-com/image/upload/v1750099013/Blog/Hero%20Images/Blog/Hero%20Images/blog-image-template-1800x945%20%2814%29_6VTUA8mUhOZNDaRVNPeKwl_1750099012960.png","2026-01-08",[259,611,728],"open source","The GitLab team recently had the privilege of judging the **iHack Hackathon** at **IIT Bombay's E-Summit**. The energy was electric, the coffee was flowing, and the talent was undeniable. But what struck us most wasn't just the code — it was the sheer determination of students to solve real-world problems, often overcoming significant logistical and financial hurdles to simply be in the room.\n\n\nThrough our [GitLab for Education program](https://about.gitlab.com/solutions/education/), we aim to empower the next generation of developers with tools and opportunity. Here is a look at what the students built, and how they used GitLab to bridge the gap between idea and reality.\n\n## The challenge: Build faster, build securely\n\nThe premise for the GitLab track of the hackathon was simple: Don't just show us a product; show us how you built it. We wanted to see how students utilized GitLab's platform — from Issue Boards to CI/CD pipelines — to accelerate the development lifecycle.\n\nThe results were inspiring.\n\n## The winners\n\n### 1st place: Team Decode — Democratizing Scientific Research\n\n**Project:** FIRE (Fast Integrated Research Environment)\n\nTeam Decode took home the top prize with a solution that warms a developer's heart: a local-first, blazing-fast data processing tool built with [Rust](https://about.gitlab.com/blog/secure-rust-development-with-gitlab/) and Tauri. They identified a massive pain point for data science students: existing tools are fragmented, slow, and expensive.\n\nTheir solution, FIRE, allows researchers to visualize complex formats (like NetCDF) instantly. What impressed the judges most was their \"hacker\" ethos. They didn't just build a tool; they built it to be open and accessible.\n\n**How they used GitLab:** Since the team lived far apart, asynchronous communication was key. They utilized **GitLab Issue Boards** and **Milestones** to track progress and integrated their repo with Telegram to get real-time push notifications. As one team member noted, \"Coordinating all these technologies was really difficult, and what helped us was GitLab... the Issue Board really helped us track who was doing what.\"\n\n![Team Decode](https://res.cloudinary.com/about-gitlab-com/image/upload/v1767380253/epqazj1jc5c7zkgqun9h.jpg)\n\n### 2nd place: Team BichdeHueDost — Reuniting to Solve Payments\n\n**Project:** SemiPay (RFID Cashless Payment for Schools)\n\nThe team name, BichdeHueDost, translates to \"Friends who have been set apart.\" It's a fitting name for a group of friends who went to different colleges but reunited to build this project. They tackled a unique problem: handling cash in schools for young children. Their solution used RFID cards backed by a blockchain ledger to ensure secure, cashless transactions for students.\n\n**How they used GitLab:** They utilized [GitLab CI/CD](https://about.gitlab.com/topics/ci-cd/) to automate the build process for their Flutter application (APK), ensuring that every commit resulted in a testable artifact. This allowed them to iterate quickly despite the \"flaky\" nature of cross-platform mobile development.\n\n![Team BichdeHueDost](https://res.cloudinary.com/about-gitlab-com/image/upload/v1767380253/pkukrjgx2miukb6nrj5g.jpg)\n\n### 3rd place: Team ZenYukti — Agentic Repository Intelligence\n\n**Project:** RepoInsight AI (AI-powered, GitLab-native intelligence platform)\n\nTeam ZenYukti impressed us with a solution that tackles a universal developer pain point: understanding unfamiliar codebases. What stood out to the judges was the tool's practical approach to onboarding and code comprehension: RepoInsight-AI automatically generates documentation, visualizes repository structure, and even helps identify bugs, all while maintaining context about the entire codebase.\n\n**How they used GitLab:** The team built a comprehensive CI/CD pipeline that showcased GitLab's security and DevOps capabilities. They integrated [GitLab's Security Templates](https://gitlab.com/gitlab-org/gitlab/-/tree/master/lib/gitlab/ci/templates/Security) (SAST, Dependency Scanning, and Secret Detection), and utilized [GitLab Container Registry](https://docs.gitlab.com/user/packages/container_registry/) to manage their Docker images for backend and frontend components. They created an AI auto-review bot that runs on merge requests, demonstrating an \"agentic workflow\" where AI assists in the development process itself.\n\n![Team ZenYukti](https://res.cloudinary.com/about-gitlab-com/image/upload/v1767380253/ymlzqoruv5al1secatba.jpg)\n\n## Beyond the code: A lesson in inclusion\n\nWhile the code was impressive, the most powerful moment of the event happened away from the keyboard.\n\nDuring the feedback session, we learned about the journey Team ZenYukti took to get to Mumbai. They traveled over 24 hours, covering nearly 1,800 kilometers. Because flights were too expensive and trains were booked, they traveled in the \"General Coach,\" a non-reserved, severely overcrowded carriage.\n\nAs one student described it:\n\n*\"You cannot even imagine something like this... there are no seats... people sit on the top of the train. This is what we have endured.\"*\n\nThis hit home. [Diversity, Inclusion, and Belonging](https://handbook.gitlab.com/handbook/company/culture/inclusion/) are core values at GitLab. We realized that for these students, the barrier to entry wasn't intellect or skill, it was access.\n\nIn that moment, we decided to break that barrier. We committed to reimbursing the travel expenses for the participants who struggled to get there. It's a small step, but it underlines a massive truth: **talent is distributed equally, but opportunity is not.**\n\n![hackathon class together](https://res.cloudinary.com/about-gitlab-com/image/upload/v1767380252/o5aqmboquz8ehusxvgom.jpg)\n\n### The future is bright (and automated)\n\nWe also saw incredible potential in teams like Prometheus, who attempted to build an autonomous patch remediation tool (DevGuardian), and Team Arrakis, who built a voice-first job portal for blue-collar workers using [GitLab Duo](https://about.gitlab.com/gitlab-duo/) to troubleshoot their pipelines.\n\nTo all the students who participated: You are the future. Through [GitLab for Education](https://about.gitlab.com/solutions/education/), we are committed to providing you with the top-tier tools (like GitLab Ultimate) you need to learn, collaborate, and change the world — whether you are coding from a dorm room, a lab, or a train carriage. **Keep shipping.**\n\n> :bulb: Learn more about the [GitLab for Education program](https://about.gitlab.com/solutions/education/).\n",{"slug":731,"featured":12,"template":13},"how-iit-bombay-students-code-future-with-gitlab",{"content":733,"config":741},{"title":734,"description":735,"authors":736,"heroImage":737,"date":738,"category":9,"tags":739,"body":740},"Artois University elevates research and curriculum with GitLab Ultimate for Education","Artois University's CRIL leveraged the GitLab for Education program to gain free access to Ultimate, transforming advanced research and computer science curricula.",[724],"https://res.cloudinary.com/about-gitlab-com/image/upload/v1750099203/Blog/Hero%20Images/Blog/Hero%20Images/blog-image-template-1800x945%20%2820%29_2bJGC5ZP3WheoqzlLT05C5_1750099203484.png","2025-12-10",[611,259,715],"Leading academic institutions face a critical challenge: how to provide thousands of students and researchers with industry-standard, **full-featured DevSecOps tools** without compromising institutional control. Many start with basic version control, but the modern curriculum demands integrated capabilities for planning, security, and advanced CI/CD.\n\nThe **GitLab for Education program** is designed to solve this by providing access to **GitLab Ultimate** for qualifying institutions, allowing them to scale their operations and elevate their academic offerings. \n\nThis article showcases a powerful success story from the **Centre de Recherche en Informatique de Lens (CRIL)**, a joint laboratory of **Artois University** and CNRS in France. After years of relying solely on GitLab Community Edition (CE), the university's move to GitLab Ultimate through the GitLab for Education program immediately unlocked advanced capabilities, transforming their teaching, research, and contribution workflows virtually overnight. This story demonstrates why GitLab Ultimate is essential for institutions seeking to deliver advanced computer science and research curricula.\n\n## GitLab Ultimate unlocked: Managing scale and driving academic value\n\n**Artois University's** self-managed GitLab instance is a large-scale operation, supporting nearly **3,000 users** across approximately **19,000 projects**, primarily serving computer science students and researchers. While GitLab Community Edition was robust, the upgrade to GitLab Ultimate provided the sophisticated tooling necessary for managing this scale and facilitating advanced university-level work.\n\n***\"We can see the difference,\" says Daniel Le Berre, head of research at CRIL and the instance maintainer. \"It's a completely different product. Each week reveals new features that directly enhance our productivity and teaching.\"***\n\nThe institution joined the GitLab for Education program specifically because it covers both **instructional and non-commercial research use cases** and offers full access to Ultimate's features, removing significant cost barriers.\n\n### Key GitLab Ultimate benefits for students and researchers\n\n* **Advanced project management at scale:** Master's students now benefit from **GitLab Ultimate's project planning features**. This enables them to structure, track, and manage complex, long-term research projects using professional methodologies like portfolio management and advanced issue tracking that seamlessly roll up across their thousands of projects.\n\n* **Enhanced visibility:** Features like improved dashboards and code previews directly in Markdown files dramatically streamline tracking and documentation review, reducing administrative friction for both instructors and students managing large project loads.\n\n## Comprehensive curriculum: From concepts to continuous delivery\n\nGitLab Ultimate is deeply integrated into the computer science curriculum, moving students beyond simple `git` commands to practical **DevSecOps implementation**.\n\n* **Git fundamentals:** Students begin by visualizing concepts using open-source tools to master Git concepts.\n\n* **Full CI/CD implementation:** Students use GitLab CI for rigorous **Test-Driven Development (TDD)** in their software projects. They learn to build, test, and perform quality assurance using unit and integration testing pipelines—core competency made seamless by the integrated platform.\n\n* **DevSecOps for research and documentation:** The university teaches students that DevSecOps principles are vital for all collaborative work. Inspired by earlier work in Delft, students manage and produce critical research documentation (PDFs from Markdown files) using GitLab, incorporating quality checks like linters and spell checks directly in the CI pipeline. This ensures high-quality, reproducible research output.\n\n* **Future-proofing security skills:** The GitLab Ultimate platform immediately positions the institution to incorporate advanced DevSecOps features like SAST and DAST scanning as their research and development code projects grow, ensuring students are prepared for industry security standards.\n\n## Accelerating open source contributions with GitLab Duo\n\nAccess to the full GitLab platform, including our AI capabilities, has empowered students to make impactful contributions to the wider open source community faster than ever before.\n\nTwo Master's students recently completed direct contributions to the GitLab product, adding the **ORCID identifier** into user profiles. Working on GitLab.com, they leveraged **GitLab Duo's AI chat and code suggestions** to navigate the codebase efficiently.\n\n***\"This would not have been possible without GitLab Duo,\" Daniel Le Berre notes. \"The AI features helped students, who might have lacked deep codebase knowledge, deliver meaningful contributions in just two weeks.\"***\n\nThis demonstrates how providing students with cutting-edge tools **accelerates their learning and impact**, allowing them to translate classroom knowledge into real-world contributions immediately.\n\n## Empowering open research and institutional control\n\nThe stability of the self-managed instance at Artois University is key to its success. This model guarantees **institutional control and stability** — a critical factor for long-term research preservation.\n\nThe institution's expertise in this area was recently highlighted in a major 2024 study led by CRIL, titled: \"[Higher Education and Research Forges in France - Definition, uses, limitations encountered and needs analysis](https://hal.science/hal-04208924v4)\" ([Project on GitLab](https://gitlab.in2p3.fr/coso-college-codes-sources-et-logiciels/forges-esr-en)). The research found that the vast majority of public forges in French Higher Education and Research relied on **GitLab**. This finding underscores the consensus among academic leaders that self-hosted solutions are essential for **data control and longevity**, especially when compared to relying on external, commercial forges.\n\n## Unlock GitLab Ultimate for your institution today\n\nThe success story of **Artois University's CRIL** proves the transformative power of the GitLab for Education program. By providing **free access to GitLab Ultimate**, we enable large-scale institutions to:\n\n1.  **Deliver a modern, integrated DevSecOps curriculum.**\n\n2.  **Support advanced, collaborative research projects with Ultimate planning features.**\n\n3.  **Empower students to make AI-assisted open source contributions.**\n\n4.  **Maintain institutional control and data longevity.**\n\nIf your academic institution is ready to equip its students and researchers with the complete DevSecOps platform and its most advanced features, we invite you to join the program.\n\nThe program provides **free access to GitLab Ultimate** for qualifying instructional and non-commercial research use cases.\n\n**Apply now [online](https://about.gitlab.com/solutions/education/join/).**\n",{"slug":742,"featured":28,"template":13},"artois-university-elevates-curriculum-with-gitlab-ultimate-for-education",{"promotions":744},[745,759,770],{"id":746,"categories":747,"header":749,"text":750,"button":751,"image":756},"ai-modernization",[748],"ai-ml","Is AI achieving its promise at scale?","Quiz will take 5 minutes or less",{"text":752,"config":753},"Get your AI maturity score",{"href":754,"dataGaName":755,"dataGaLocation":241},"/assessments/ai-modernization-assessment/","modernization assessment",{"config":757},{"src":758},"https://res.cloudinary.com/about-gitlab-com/image/upload/v1772138786/qix0m7kwnd8x2fh1zq49.png",{"id":760,"categories":761,"header":762,"text":750,"button":763,"image":767},"devops-modernization",[715,557],"Are you just managing tools or shipping innovation?",{"text":764,"config":765},"Get your DevOps maturity score",{"href":766,"dataGaName":755,"dataGaLocation":241},"/assessments/devops-modernization-assessment/",{"config":768},{"src":769},"https://res.cloudinary.com/about-gitlab-com/image/upload/v1772138785/eg818fmakweyuznttgid.png",{"id":771,"categories":772,"header":774,"text":750,"button":775,"image":779},"security-modernization",[773],"security","Are you trading speed for security?",{"text":776,"config":777},"Get your security maturity score",{"href":778,"dataGaName":755,"dataGaLocation":241},"/assessments/security-modernization-assessment/",{"config":780},{"src":781},"https://res.cloudinary.com/about-gitlab-com/image/upload/v1772138786/p4pbqd9nnjejg5ds6mdk.png",{"header":783,"blurb":784,"button":785,"secondaryButton":790},"Start building faster today","See what your team can do with the intelligent orchestration platform for DevSecOps.\n",{"text":786,"config":787},"Get your free trial",{"href":788,"dataGaName":48,"dataGaLocation":789},"https://gitlab.com/-/trial_registrations/new?glm_content=default-saas-trial&glm_source=about.gitlab.com/","feature",{"text":493,"config":791},{"href":52,"dataGaName":53,"dataGaLocation":789},1773350823123]