Deploying and Automating Vertica Eon mode on AWS Outposts and FlashBlade
For the last year and a half I’ve been working at Pure Storage, focused on AI and Analytics applications using FlashBlade. It’s been a great way to learn and get my hands on some exciting technologies. I’ve spent the last year working closely with our partners at Vertica, and I’ve been blown away by the combination of Vertica Eon mode with FlashBlade fast object storage as the S3 communal storage pool. Recently, Pure became an AWS Outposts Service Ready partner, the first hardware vendor to earn that distinction. Outposts is a great way to combine the best characteristics of cloud, on-prem, and hybrid operations. We’ve also invested in our own AWS Outpost equipment, setting it up in our edge datacenter to enable testing for our customers, our partners, and ourselves.
Of course I needed to take Outposts for a spin with Vertica. And to cut any suspense short: it works, and it works very well! Over the year working with Vertica, I’ve also developed some code to automate setup for Vertica Proof-of-Concept (PoC) testing. I recently adapted the code to also work with Outposts, and wanted to share the results here.
Getting Started
To get started, simply head over to the GitHub repository: https://github.com/microslav/vertica-poc
There’s a comprehensive README file that explains all the steps for getting set up and running the automation. The end result is a PoC environment that you can use to explore the capabilities of Vertica Eon mode running on FlashBlade communal storage.
We’ve used various incarnations of this code to set up PoCs with bare metal servers, vRealize Automation, manually configured VMs, and now AWS instances on Outposts. Like any complex automation, there may be some tweaks and adjustments needed for a particular environment. But it’s built to be flexible and should be able to adapt to a range of PoC environments with minor changes.
Quick Overview
The README file has all the gory details. So go there to geek out. But as a quick overview, here’s what the automation does:
- Create AWS instances in the Outpost. We need a small instance to coordinate the PoC and host the Management Console (MC), and three larger instances for the cluster nodes. I’ve successfully deployed instances with both the Vertica BYOL and Amazon Linux 2 AMIs. (The latter supports a broader range of instance types.)
- Configure your laptop (or jumpbox) to connect to the new MC instance and log in. The automation is set up to run with root access, and there’s a script to help open root access on the new instances. Since these run in the Outpost VPC and don’t have any external IPs (unless assigned), this is a bit more secure than running in the public cloud.
- Copy the necessary binary packages to the MC instance. These are the database and management console packages downloaded from Vertica, and optionally the RapidFile Toolkit package from Pure. (There’s also an optional TPC-DS zip file that we use for internal PoCs.)
- Edit the vertica_poc_prep.sh script to customize it for the environment. Then source the script on the MC instance. (Sourcing it instead of executing the script makes new variables available in the environment for Ansible to pick up later.)
- Run the Ansible playbook. It’s divided into multiple roles for the different instances, and a role for configuring the FlashBlade via the MC host. The execution will pause by default after the play for each role, giving you a chance to validate that everything worked.
- Configure access to the Management Console GUI. This may require setting up SSH tunnels depending on network restrictions between where you have a browser available and where the MC instance is running. The playbook will suggest a possible tunneling command to try.
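For the tunneling step, the playbook prints its own suggestion, so treat the following only as a rough sketch of what a local port forward to the MC instance tends to look like. The IP address and user are made-up placeholders, and the port is an assumption (5450 is the usual MC web port, but check your installation). The snippet only builds and prints the command rather than running it.

```shell
# Placeholder values: none of these come from the repository or the playbook.
MC_HOST="10.0.0.10"   # hypothetical private IP of the MC instance in the Outpost VPC
MC_PORT=5450          # assumed MC web port; verify against your install
LOCAL_PORT=5450       # port to open on the laptop or jumpbox
SSH_USER="root"       # assumed login user; adjust to match your access setup

# -N: do not run a remote command, just forward ports
# -L: forward LOCAL_PORT on this machine to MC_PORT on the MC host itself
tunnel_cmd="ssh -N -L ${LOCAL_PORT}:localhost:${MC_PORT} ${SSH_USER}@${MC_HOST}"

# Print the command so it can be reviewed before running it by hand.
echo "$tunnel_cmd"
```

With a tunnel like this running, the browser on the local machine would point at localhost on the chosen local port instead of the instance's private address.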
Once all those steps are completed, you should have a shiny new Vertica Eon mode cluster, using FlashBlade S3 communal storage! Now you can run your PoC, or just experiment and explore the environment.
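If the note about sourcing versus executing the prep script seems abstract, here is a small self-contained illustration. It does not use the real vertica_poc_prep.sh. Instead it writes a one-line stand-in script to a temp file, and the variable name POC_DEMO_VAR is invented for the demo.

```shell
# Hypothetical stand-in for a prep script that exports a setting.
demo_script="$(mktemp)"
cat > "$demo_script" <<'EOF'
export POC_DEMO_VAR="set-by-prep-script"
EOF

# Executing: the script runs in a child shell, so its exports vanish on exit.
unset POC_DEMO_VAR
sh "$demo_script"
after_exec="${POC_DEMO_VAR:-unset}"
echo "after executing: $after_exec"

# Sourcing: the script runs in the current shell, so the variable sticks around.
. "$demo_script"
after_source="${POC_DEMO_VAR:-unset}"
echo "after sourcing:  $after_source"

# Exported variables are inherited by child processes started from this shell,
# which is how a later tool such as ansible-playbook would get to see them.
child_view="$(sh -c 'echo "${POC_DEMO_VAR:-unset}"')"
echo "seen by child:   $child_view"

rm -f "$demo_script"
```

The same reasoning applies to the real script: if it is executed rather than sourced, the shell that later launches the playbook never sees the new variables.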
Next Steps
It’s often better to show as well as tell, so in a follow-up post I’m planning to share a demo video documenting all the steps that I described above. I’ve also created a small script to set up the ephemeral instance storage on the Outpost instances as Vertica Depot cache. I’ll go over that code in a future blog post. Lastly, I plan on deploying a secondary subcluster in the AWS region and running a fully hybrid configuration. If you’re interested, please follow me and keep an eye out for the follow-up posts.
Conclusion
For me, the big conclusion is that it was simple to take what I know about running Vertica on-prem and deploy it on Outposts at the edge. This delivers a strong combination of capabilities: Vertica Eon mode’s powerful analytics, the AWS automation and DevOps approach running in our own datacenter, and FlashBlade’s simplicity, performance, and scalability. If you run on-prem today and want a bridge to the cloud, Outposts and FlashBlade deliver. And if you love running in AWS but have applications that need to be on-prem, Outposts and FlashBlade can be the solution.