MikroTik CSS326 fan installation

It was time to upgrade the networking equipment in my homelab. I needed two 24-port 1Gb switches with 10Gb uplinks to facilitate moving my homelab into my crawl space and out of my office. As things are configured right now I’d easily saturate a 1Gb uplink between the switches, and since I don’t have any 10Gb in my homelab yet, the CSS326 fits the bill. I am replacing a single first generation Ubiquiti 24-port switch with two CSS326-24G-2S+RMs.

After receiving my CSS326-24G-2S+RMs I plugged them in to test them and verify my 10Gb SFP+ transceivers were working properly. While I was doing that I also hooked them into PRTG using SNMP to see what kind of monitoring data I could pull from them. I had been optimistic that these new switches would run significantly cooler than my Ubiquiti, but that does not seem to be the case. My Ubiquiti (which has a fan) runs at about 76c. The MikroTik ran at about 70c without a fan. An improvement but not a fantastic one.

Below is 1 hour of monitoring data from the MikroTik with a single 10Gb SFP+ installed in it, running idle but connected to my Ubiquiti via 1GbE and to the other CSS326 via fiber.

CSS326-24G-2S+RM – 1 hour – CPU Temperature – Average 70c
CSS326-24G-2S+RM – 1 hour – SFP+ Temperature – Average 43c

The CSS326-24G-2S+RM has a mount for a 40mm fan, so I figured I’d just buy a fan and slap it in there. Unfortunately, once I popped it open I saw there was no header to connect a fan to. I did some digging and came across this video, which shows a significant temperature improvement once a fan is installed; the creator helpfully pointed out where you can tap into the switch’s board to get power. I also found this helpful forum post that showed the J2 header and mentioned which connector was positive (+) and which was negative (-).

“YOU DON’T NEED TO DO THIS. Mine runs in a warm rack enclosure and has for 3 years now”

– Someone I know

Using my multimeter I checked the J2 connector and it outputs 24v a few seconds after the switch boots up. From what I have read, J2 only outputs power if you connect the included power cable to your CSS326. If you use PoE to power the CSS326, the J2 connector does not output any power. I did not test this, but since I’m using the supplied power cables this isn’t a problem for me.

Noctua is my preferred fan manufacturer but they do not make a 24v 40mm fan which means I had to use a buck converter to drop 24v down to 12v for the Noctua NF-A4x20 I wanted to use. I will put a full parts list at the bottom of this blog post.

I did not solder to the J2 connector as my first step. I just want to show where it is before continuing. Please excuse my poor soldering skills. This is only my 3rd or 4th time seriously soldering something and my first time soldering to a PCB. Using the information I gathered, I am labelling which connectors I treated as positive (+) and negative (-). I might be wrong but it all worked in the end.

J2 before I soldered my wires
J2 after I soldered my wires

Easy step first: I mounted the fan into the CSS326 using some M3x12mm screws, nuts and a little patience.

I then 3D printed a baffle to block off the dead space to the right of the fan if you’re looking at the switch from the front. The electrical tape is just to hold the baffle in place while I’m fiddling inside the switch. Once you put the top back on the baffle is firmly pinched in place and won’t move. As designed the baffle is overkill. It could be 3mm thinner but I don’t care about saving 6g of filament so I didn’t change it for the second switch. The STL is linked at the bottom of this post in the parts list.

Using the cables and adapters that came with the Noctua fan I was able to piece things together in a way that I would never need to touch the J2 connector again if something failed. The buck converter can be detached from the main feed connected to the J2 and the fan can be disconnected from the buck converter. If either piece ever dies it should be very simple to replace them. I specifically used the extension cable and the Y-splitter. You can toss the one labelled “Low-Noise Adapter” into your spare parts bin, we won’t be needing it.

I have labelled each connector with a number so you can see how I pieced things together:

I removed all of the sheathing from the cables, removed any blue/green wires because I only need the yellow and black ones and then cut off some of the connectors.

Piecing them all together they will look like this:

Numbers in brackets mean a cut

(1) – Is the small connector on the extension cable that gets removed and soldered to J2

2 – Is the large connector on the extension cable that does NOT get removed. The IN on the buck converter will plug into this.

(3) – Is the large connector on the upper leg of the Y-splitter that gets removed and soldered to the IN on the buck converter

4 – Is the large connector on the lower leg of the Y-splitter that does NOT get removed. The fan will plug into this, and it is attached to the OUT on the buck converter.

(5) – Is the small connector at the base of the Y-splitter. You want to cut this so that the smaller connector remains attached to the upper leg of the Y-splitter that you removed the large connector (3) from. This then gets soldered to the OUT on the buck converter

6 – Is the fan’s connector, leave it alone. You will plug it into 4 when everything is done

Before soldering (1) to the J2 connector, feed it through the gap in the baffle and under the mainboard so you can keep it all tucked out of the way. If you don’t do this first you’ll have to remove the mainboard and baffle to do it later. There should be just enough wire to make it to the J2.

Solder it all together based on the diagram above and plug everything in except for the fan. We need to adjust the buck converter before we can plug in the fan. Odds are its default setting is too high (more than 12v) for our Noctua.

Set your multimeter to DC volts on a range that can read higher than 20v, connect alligator clips to the OUT side of the buck converter and then connect those to your multimeter. Plug the power into the switch and check your multimeter reading. Using a flathead screwdriver, carefully turn the small knob on top of the blue box on the buck converter until your multimeter reads 12v.

Initial buck converter setting
Buck converter reading after a few turns

Disconnect the power from the switch, remove your multimeter and alligator clips, screw together the buck converter case and put the top back on the CSS326. You’re done!

My final results were that the switch ended up running about 30c cooler and the SFP was 11c cooler.

CSS326-24G-2S+RM – 1 hour – CPU Temperature – Average 40c
CSS326-24G-2S+RM – 1 hour – SFP+ Temperature – Average 32c
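
As a quick sanity check on those numbers, the drops can be recomputed from the four averages in the graph captions. This is only a sketch; the helper name `temp_drop` is made up for illustration and the values are copied from the captions above.

```shell
#!/bin/sh
# Temperature drop in whole degrees C: before ($1) minus after ($2).
# Hypothetical helper, not part of any tool used in this post.
temp_drop() {
    echo $(( $1 - $2 ))
}

# Averages taken from the 1 hour graphs (before fan, after fan).
echo "CPU drop:  $(temp_drop 70 40)c"
echo "SFP+ drop: $(temp_drop 43 32)c"
```

That gives 30c for the CPU and 11c for the SFP+, matching the figures quoted above.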

I get MikroTik saving cost by not including a fan in the switch but I really wish they would have at least installed a connector on the J2 to make adding a fan an easy option.

Update – 2022-08-20

I moved my entire homelab off my old Ubiquiti switch yesterday and have some real-world temperatures with actual load on the switch:

The first low section on the left was the switch idling with no load while I configured VLANs and LAGs. The gap is me unplugging it, sliding it under my Ubiquiti switch and powering it back on. The initial high temperature (45.8c) was from the Ubiquiti smothering it while I did cable swaps. I eventually removed the Ubiquiti switch and the temperatures dropped a bit.

Parts List

Buck Converter – I used a “LM2596 DC-DC Buck Converter Step Down Module Power Supply DIP Output 1.25V-30V 3A”. There are a ton of these on Amazon. Here is a non-referral link to a 10pack I bought.

Noctua NF-A4x20 – Since there is plenty of room in the case I went with the 40mm * 20mm version of this fan to get the most air movement possible.

Buck Converter Case – I printed one of these to insulate the buck converter from the chassis of the switch.

Baffle – Completely optional but I designed and printed one of these to block off the section of the case to the right of the fan mount. Seemed pointless to circulate that air since there are no electronics in there except the buck converter.

Silencing my Dell T340 – Part 3

At long last, part 3 of my journey to try and cool my T340 without having to listen to a hair dryer.

Here is part 1 and part 2 if you’re curious about what I’ve done so far.

I ended up getting a 3D Printer sometime after I wrote part 2 and one of the projects I had in mind was designing and printing a shroud that I could attach fan(s) to and slide over top of the heatsink in my T340 to create a better seal for airflow and get rid of the zap strap solution from part 2.

I was hoping that having a proper shroud would increase cooling efficiency, unfortunately I don’t think it did much for my overall temperatures BUT it did make it so my fan is now easily replaceable and just slides overtop the heatsink. More on that later (or just scroll to the bottom).

Here is what I came up with:

It’s hard to tell in the photos but there is a tiny lip at the bottom that snugly tucks over the base of the heatsink to prevent the whole shroud from just sliding off over time.

I used the rubber fan holders that Noctua includes with their fans and they fit very nicely in the holes. If you’re going to use a different fan I can’t guarantee the screw holes will hold up to standard case fan screws. An M4 screw and nut should work just fine though.

When mounting the fan be very careful. I printed at 0.3mm layer height and found that if I yanked too hard when installing/removing the rubber stoppers the layers would peel apart. This might be solved by printing at 0.2mm.

Here it is installed:

I used a Noctua NF-A9 PWM (92mm*92mm*25mm). I originally planned to buy two and set them up in a push/pull configuration but Amazon sold out. Turns out this was lucky for me because it appears Dell’s engineers left a really sweet hunk of plastic sticking up from the motherboard which prevents mounting a 25mm thick fan to the back of the shroud:

I see Noctua sells 92mm*92mm*14mm fans that might fit in there. If someone wants to donate two I will totally update the shroud design with two fan mounts and post an update. Based on my reading I don’t think a push/pull setup will benefit overall temperatures much though since this heatsink is pretty small and has a simple design.

Ok, what you probably care about, was there a performance improvement in cooling over my original zap strap design? Possibly.

I say possibly because I stupidly didn’t blow out my server of dust before starting all of this. I ended up blowing the dust out during some size checks but before installing the shroud. Here are my recorded temperatures:

  1. Transcoding a Bluray, all CPU workload with the old cooling setup, average temperature of 80c
  2. I blew the dust out of the case. You can see I ended up dropping my average idle load temperatures by 5c
  3. Point where I installed the new shroud
  4. Transcoding a Bluray, all CPU workload with the shroud installed, there is a 15c drop compared to (1) at an average temperature of 65c. This is probably partially the shroud and partially blowing out all the dust

Another discrepancy between (1) and (4) is the fan itself. Originally I installed an NF-B9 redux-1600 PWM, which only runs at 1600RPM and pushes 64.3 m³/h of air. The new fan is an NF-A9 PWM that runs at 2000RPM and pushes 78.9 m³/h of air.
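
To put the fan difference in numbers, here is a quick calculation from the two sets of spec figures quoted above. It is only a sketch: the helper name `pct_increase` is made up for illustration, and it compares rated values only, which says nothing about how much air actually makes it through the heatsink.

```shell
#!/bin/sh
# Percentage increase from the old value ($1) to the new value ($2),
# rounded to one decimal place. Hypothetical helper for illustration.
pct_increase() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", (b - a) / a * 100 }'
}

echo "Fan speed: +$(pct_increase 1600 2000)%"   # rated RPM
echo "Airflow:   +$(pct_increase 64.3 78.9)%"   # rated m3/h
```

On paper the new fan spins about 25% faster and moves roughly 23% more air, so some of the drop between (1) and (4) could simply be the fan.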

All that being said, I’m happy with ~65c at peak load and I can’t hear a thing. Idle temps seem to be roughly the same.

Now for what you’re probably here for, the STL file: Dell T340 Heatsink Shroud v1.6

You can also find it on Thingiverse.

I printed at 0.3mm. I’d recommend doing 0.2mm to hopefully make it a bit stronger so you don’t have to be as careful when installing the fan. 100% infill. You might also want to rotate the print so the fan screw holes are flat on the bed.

Alternatively you can skip ALL of this and try CJ’s suggestion he recently posted on my Part 1 which is a BIOS setting change.

Update 2022-08-12 – Here is the last 365 days of temperatures. The spike to 72c is likely the CPU under 100% load for a sustained amount of time. I think my Cookie Clicker VM was causing it.

Upgrading vSphere 6.7 to 7.0 using the Dell custom ISO

Took the plunge today and upgraded my homelab from vSphere 6.7 (Dell custom ISO) to vSphere 7.0 (again, Dell custom ISO).

My first attempt failed due to some dependency problems:

The main take away from this is:

QLC_bootbank_qedf_1.2.24.6-1OEM.600.0.0.2768847
QLC_bootbank_scsi-qedil_1.0.22.0-1OEM.600.0.0.2494585

I booted my node back up and enabled SSH and ran the following:

esxcli software vib list |grep qed

Which provided me a list of packages that included “qed”. I was able to quickly identify the packages with matching version numbers and then remove them:

esxcli software vib remove --vibname=qedf
esxcli software vib remove --vibname=scsi-qedil
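
The names passed to `--vibname` can be read straight out of the conflict strings in the upgrade error. As a rough sketch, assuming the strings always follow the `<vendor>_bootbank_<name>_<version>` layout seen in the two entries above (the helper name `vib_name` is made up for illustration), a bit of POSIX shell can pull the name out and print the matching remove commands for review instead of running them:

```shell
#!/bin/sh
# Hypothetical helper: extract the VIB name from a conflict string of the form
# <vendor>_bootbank_<name>_<version> (layout assumed from the two errors above).
vib_name() {
    rest=${1#*_bootbank_}       # drop "<vendor>_bootbank_"
    printf '%s\n' "${rest%_*}"  # drop the trailing "_<version>"
}

# Print (do not run) a remove command for each conflicting VIB.
for vib in \
    QLC_bootbank_qedf_1.2.24.6-1OEM.600.0.0.2768847 \
    QLC_bootbank_scsi-qedil_1.0.22.0-1OEM.600.0.0.2494585
do
    echo "esxcli software vib remove --vibname=$(vib_name "$vib")"
done
```

Printing the commands first makes it easy to compare them against the `esxcli software vib list` output before removing anything.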

After that I was able to reboot and perform the upgrade.

These appear to be QLogic drivers that likely came with the vSphere 6.7 Dell ISO and have since been dropped or replaced on the vSphere 7.0 ISO. I don’t use any QLogic hardware in my server so removing them didn’t pose much of a risk to me.

Backing up a VM with a PCIe device attached to it with Veeam

In a previous post I talked about installing a Quadro P620 into my ESXi host so I could attach it to my Plex VM. This worked out great except my Veeam backups started failing.

There is a limitation in VMware vSphere where you can’t take a Snapshot of a VM with a PCIe device passed through to it.

One option is to install the Veeam Agent for the OS you’re running and use it to take guest based backups. This isn’t ideal though in my opinion. I would much rather keep my host based backups of the VM. Fortunately there is an easy solution to this problem.

Shut off the VM before taking the Veeam backup and then power it back on after the backup is complete.

To get this working you need to install the VMware PowerShell Module on your Veeam server. To do this perform the following steps:

  1. Right click on the PowerShell shortcut and choose ‘Run as Administrator’
  2. Run the following commands:
    Find-Module -Name VMware.PowerCLI
    Install-Module -Name VMware.PowerCLI -Scope AllUsers
    Get-Command -Module *VMWare*
    Set-PowerCLIConfiguration -Scope AllUsers -ParticipateInCeip $false -InvalidCertificateAction Ignore
  3. You should see a large list of VMware PowerShell commands output which means you’ve successfully installed the module

Next up you need to make sure your Veeam Services are running under a Service Account with the appropriate permissions in vCenter. I believe this is normally a best practice and chances are you’ve all already done this. In my case I’d installed Veeam as a local service. Don’t know why but to fix it I just flipped over the following Windows Services to run as my backup operator account which had Domain Admin, Backup Operator, Local Admin on the Veeam Server and Administrator on vSphere permissions already.

The services were:

  • Veeam Backup Enterprise Manager
  • Veeam Backup Service
  • Veeam Broker Service
  • Veeam Cloud Connect Service
  • Veeam Guest Catalog Service
  • Veeam RESTful API Service

I then rebooted my Veeam server.

I already have my vCenter server joined to my domain, but I did run into an issue where single sign-on wasn’t working properly. If I attempted to connect to my vCenter server via PowerShell using “Connect-VIServer <VCENTER SERVER FQDN>” I would be prompted for credentials, which shouldn’t be happening since the account I’m logged in as is an Administrator in vCenter.

Turned out I needed to add my AD Group that gives users Administrative access to the vCenter Global Permissions list:

  1. Login to vCenter as an administrator
  2. Click ‘Menu’ and ‘Administration’
  3. Click ‘Global Permissions’
  4. Click ‘Add’
  5. Change the ‘User’ field to your domain, search for the user or security group (I recommend security groups) and select it, make sure the role is ‘Administrator’ and check ‘Propagate to children’ and click ‘Ok’

After doing this I could run “Connect-VIServer <VCENTER SERVER FQDN>” and not be prompted for credentials.

Now that all the prep-work is done we can re-configure our backup job in Veeam.

First we’re going to need two scripts, one to shutdown the VM and one to boot it back up. I’ve saved these scripts on my Veeam server in “C:\Scripts\<VM FQDN>\”

The shutdown script is “shutdown.bat”, be sure to search and replace “VCENTER FQDN” and “VM FQDN” with your values:

C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -NoLogo -executionpolicy bypass -Command Connect-VIServer -Server "VCENTER FQDN"; Shutdown-VMGuest -VM "VM FQDN" -Server "VCENTER FQDN" -Confirm:$false; do{Start-Sleep -s 5; $vm=Get-VM -Name "VM FQDN"}while ($vm.PowerState -eq \"PoweredOn\")

The startup script is “startup.bat”, be sure to search and replace “VCENTER FQDN” and “VM FQDN” with your values:

C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -NoLogo -executionpolicy bypass -Command Connect-VIServer -Server "VCENTER FQDN"; Start-VM -VM "VM FQDN" -Server "VCENTER FQDN"; Start-Sleep -s 90

Once you’ve created these, fire up the Veeam console and re-configure the VM’s job:

  1. Launch Veeam
  2. Find the backup job for your VM, right click on it and choose ‘Edit’
  3. Go to ‘Storage’
  4. Click ‘Advanced’
  5. Go to ‘Scripts’
  6. Checkmark ‘Run the following script before the job:’ and select your “shutdown.bat” script
  7. Checkmark ‘Run the following script after the job:’ and select your “startup.bat” script
  8. Click ‘Ok’
  9. Click ‘Finish’
  10. Perform a test run of the job, you can monitor the start-up/shutdown in vCenter

That’s it. Minor inconvenience but it works. Hopefully vSphere 7 will allow for snapshots on VMs with pass-through devices configured.

Adding a Quadro P620 to my Plex VM

I currently run Plex in a CentOS 7 VM (on top of vSphere 6.7) with 2 vCPUs and 2GB of vRAM.

When I need to transcode video to sync it to a mobile device for a trip, the process takes a while and consumes a lot of CPU on the VM. I could just add more vCPUs to the VM but I have a limit on how much CPU I have to toss around and there are more efficient ways to transcode video.

I bought my Dell T340 specifically with a Xeon E-2176G CPU in it so I could take advantage of the on-board GPU to handle my transcoding work. After a bunch of back and forth with VMware, Dell and Intel it turns out that Dell did not build the T340 in a way that it can actually use the on-board GPU on my CPU. Why they offer it as a choice, I don’t know but here we are.

My next option was to purchase a video card to do the work. I did some research and came up with the Quadro P620 (specifically the PNY version) being the most affordable with the features I wanted, specifically NVENC. Added bonus, it supports HEVC (H.265) which should future-proof me for a while and allow me to eventually take advantage of this card for transcoding my Blurays to H.265, but that’s another post.

The card arrived, I installed it, enabled it for passthrough in vSphere, attached it to my Plex VM and booted it up.

I downloaded the latest nVidia driver to my VM and ran the installer (as root):

# chmod a+x NVIDIA-Linux-x86_64-430.50.run
# ./NVIDIA-Linux-x86_64-430.50.run

The installation was straightforward; in fact it took care of everything I needed. It automatically blacklisted the default video device for me, asked me to reboot and re-run the installer, which I did, and everything almost worked.

After the driver was successfully installed I ran the nvidia tool provided with the drivers to verify things and was greeted with:

# nvidia-smi

Unable to determine the device handle for GPU 0000:03:00.0: Unknown Error

Fortunately this issue is well documented on the internet and the quick fix was to shut down the VM and make a tweak to its configuration. Since I have vCenter I used the GUI to solve this problem instead of downloading the VMX file, editing it and re-uploading the VMX file for the VM:

  1. Login to vCenter
  2. Right click and choose ‘Edit Settings’ on the VM
  3. Go to ‘VM Options’ and expand ‘Advanced’
  4. Click ‘Edit Configuration’
  5. Click ‘Add Configuration Params’
  6. Enter the following without quotes:
    Name: “hypervisor.cpuid.v0”
    Value: “FALSE”
  7. Click ‘Ok’
  8. Click ‘Ok’
  9. Boot up the VM
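
For anyone editing the VMX file by hand instead of using the vCenter GUI, the same setting is a single line in that file. Below is a minimal sketch, assuming the VM is powered off and you are working on a copy of its .vmx file. The function name and the path in the usage comment are placeholders for illustration, not something from this setup.

```shell
#!/bin/sh
# Hypothetical helper: append the setting to a .vmx file unless a
# hypervisor.cpuid.v0 line is already present (so it is safe to re-run).
add_cpuid_flag() {
    vmx=$1
    if ! grep -q '^hypervisor\.cpuid\.v0' "$vmx"; then
        printf '%s\n' 'hypervisor.cpuid.v0 = "FALSE"' >> "$vmx"
    fi
}

# Usage (placeholder path, substitute your own VM's file):
# add_cpuid_flag /path/to/your-vm.vmx
```

Unlike the GUI fields, the value in the VMX file itself is written with quotes.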

Once the VM came back up I got the output I was expecting from nvidia-smi

# nvidia-smi

Thu Oct 24 18:36:20 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.50       Driver Version: 430.50       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro P620         Off  | 00000000:13:00.0 Off |                  N/A |
| 40%   54C    P0    N/A /  N/A |     10MiB /  2000MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

The last thing to do before testing was to make sure Plex was configured to use hardware transcoding:

  1. Login to your Plex’s WebUI
  2. Under ‘Settings’ click ‘Transcoder’
  3. Checkmark ‘Use hardware acceleration when available’
  4. Click ‘Save Changes’

I then gave things a quick test by trying to sync a TV show to my iPhone and then re-ran nvidia-smi:

# nvidia-smi

Thu Oct 24 18:38:59 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.50       Driver Version: 430.50       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro P620         Off  | 00000000:13:00.0 Off |                  N/A |
| 41%   57C    P0    N/A /  N/A |    177MiB /  2000MiB |     20%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0     22510      C   /usr/lib/plexmediaserver/Plex Transcoder     167MiB |
+-----------------------------------------------------------------------------+

Bingo, that was it. Now, how much faster was the Quadro P620 than my Xeon E-2176G? Roughly 4.5x faster.

My Plex transcoding settings are:

  • Transcoder quality: Prefer higher quality encoding
  • Background transcoding x264 preset: Medium
  • Maximum simultaneous video transcode: 4

But wait you might say, why set “Maximum simultaneous video transcode” to “4”? A Quadro P620 can only do 2?

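This is why. The nvidia-patch project removes the driver’s limit on simultaneous NVENC encoding sessions, and applying it only took a few seconds as root: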

# git clone https://github.com/keylase/nvidia-patch.git
# cd nvidia-patch/
# bash patch.sh