Maintenance
VISION System Back Online (4-23-2026)
Subject: VISION Supercomputer Back Online
This email is being sent to current VISION customers.
The VISION supercomputer is back online following the extended maintenance window.
During this period, several important updates were completed to improve system reliability, consistency, and day-to-day usability:
- Infrastructure Alignment: Coordinated updates across networking, storage, and compute systems to better align the platform with its reference architecture
- GPU Compute Nodes: Firmware and driver updates to reduce configuration inconsistencies
- Lustre Storage: Improvements to fast scratch storage to provide more consistent performance across the system
- Scheduling & Storage Components: Updates to support more stable and predictable system behavior
- Operational Improvements: Additional documentation and automation to help make future maintenance faster and less disruptive
You may notice more consistent behavior and improved stability as a result of these changes. Additional refinements will continue as part of normal beta-phase operations.
Any questions should be directed to help@vision.tamus.edu.
Thank you for your patience during the maintenance period and for your continued participation and feedback.
Extended Maintenance
Update: VISION Supercomputer Maintenance Extension
Critical VISION maintenance activities are ongoing and are now expected to be completed by end of day Wednesday, April 22, extending beyond the original Monday, April 20 target.
Our team is working hard to complete all necessary updates. We will provide additional clarification on changes to the maintenance window as more information becomes available.
Any questions regarding this extension should be directed to help@vision.tamus.edu.
Thank you for your continued collaboration and patience.
VISION Supercomputer Scheduled Maintenance - April 14-20, 2026
VISION Unavailable April 14-20, 2026 for System Maintenance
Status: System Unavailable
The VISION supercomputer is scheduled for a critical, full-system maintenance and update period from April 14 through April 20. VISION will be completely unavailable for the entire duration.
This maintenance is intended to reduce future disruptions, improve system consistency, and establish a more predictable maintenance schedule.
Planned Work:
- Infrastructure: Storage and infrastructure maintenance, including hardware remediation, to support improved performance and resiliency.
- Access Controls: File system and permissions configuration updates (such as finalized changes to NFS configuration and home directory permissions) to enforce consistent access controls.
- System Utilization: Scheduling and workload management configuration changes, including updates to Slurm project-based/group-based scheduling and the conversion of Slurm partitions to QoS, intended to improve fairness and overall system utilization.
- Redundancy: Networking-related updates, coordinated with storage changes, aimed at improving redundancy and enabling future maintenance activities.
- Operational Readiness: Monitoring and logging updates to support improved system health visibility and operational notifications.