Silencing my Dell T340 – Part 3

At long last, part 3 of my journey to try to cool my T340 without having to listen to a hair dryer.

Here is part 1 and part 2 if you’re curious about what I’ve done so far.

I ended up getting a 3D printer sometime after I wrote part 2, and one of the projects I had in mind was designing and printing a shroud that I could attach fan(s) to and slide over top of the heatsink in my T340, both to create a better seal for airflow and to get rid of the zap strap solution from part 2.

I was hoping that having a proper shroud would increase cooling efficiency. Unfortunately, I don’t think it did much for my overall temperatures, BUT it did make my fan easily replaceable since it now just slides over top of the heatsink. More on that later (or just scroll to the bottom).

Here is what I came up with:

It’s hard to tell in the photos but there is a tiny lip at the bottom that snugly tucks over the base of the heatsink to prevent the whole shroud from just sliding off over time.

I used the rubber fan holders that Noctua includes with their fans and they fit very nicely in the holes. If you’re going to use a different fan I can’t guarantee the screw holes will hold up to standard case fan screws. An M4 screw and nut should work just fine though.

When mounting the fan, be very careful. I printed at 0.3mm layer height and found that if I yanked too hard when installing/removing the rubber stoppers the layers would peel apart. This might be solved by printing at 0.2mm.

Here it is installed:

I used a Noctua NF-A9 PWM (92mm*92mm*25mm). I originally planned to buy two and set them up in a push/pull configuration, but Amazon sold out. Turns out this was lucky for me, because it appears Dell’s engineers left a really sweet hunk of plastic sticking up from the motherboard which prevents mounting a 25mm thick fan to the back of the shroud:

I see Noctua sells 92mm*92mm*14mm fans that might fit in there. If someone wants to donate two I will totally update the shroud design with two fan mounts and post an update. Based on my reading I don’t think a push/pull setup will benefit overall temperatures much though since this heatsink is pretty small and has a simple design.

OK, on to what you probably care about: was there a performance improvement in cooling over my original zap strap design? Possibly.

I say possibly because I stupidly didn’t blow the dust out of my server before starting all of this. I ended up blowing the dust out during some size checks, but before installing the shroud. Here are my recorded temperatures:

  1. Transcoding a Blu-ray (an all-CPU workload) with the old cooling setup: average temperature of 80°C
  2. I blew the dust out of the case. You can see my average idle temperatures dropped by 5°C
  3. The point where I installed the new shroud
  4. Transcoding a Blu-ray (an all-CPU workload) with the shroud installed: a 15°C drop compared to (1), at an average temperature of 65°C. This is probably partially the shroud and partially blowing out all the dust

Another discrepancy between (1) and (4) is the fan itself. Originally I installed an NF-B9 redux-1600 PWM, which only runs at 1600RPM and pushes 64.3 m³/h of air. The new fan is an NF-A9 PWM that runs at 2000RPM and pushes 78.9 m³/h of air, roughly 23% more airflow.

All that being said, I’m happy with ~65°C at peak load and I can’t hear a thing. Idle temps seem to be roughly the same.

Now for what you’re probably here for, the STL file: Dell T340 Heatsink Shroud v1.6

You can also find it on Thingiverse.

I printed at 0.3mm. I’d recommend doing 0.2mm to hopefully make it a bit stronger so you don’t have to be as careful when installing the fan. 100% infill. You might also want to rotate the print so the fan screw holes are flat on the bed.

Alternatively, you can skip ALL of this and try CJ’s suggestion, which he recently posted on Part 1: a BIOS setting change.

Update 2022-08-12 – Here are the last 365 days of temperatures. The spike to 72°C is likely the CPU under 100% load for a sustained amount of time. I think my Cookie Clicker VM was causing it.

Upgrading vSphere 6.7 to 7.0 using the Dell custom ISO

Took the plunge today and upgraded my homelab from vSphere 6.7 (Dell custom ISO) to vSphere 7.0 (again, Dell custom ISO).

My first attempt failed due to some dependency problems:

The main takeaway from the error was these two VIBs:

QLC_bootbank_qedf_1.2.24.6-1OEM.600.0.0.2768847
QLC_bootbank_scsi-qedil_1.0.22.0-1OEM.600.0.0.2494585

I booted my node back up and enabled SSH and ran the following:

esxcli software vib list | grep qed

This provided a list of installed packages that included “qed”. I was able to quickly identify the packages with matching version numbers and then remove them:

esxcli software vib remove --vibname=qedf
esxcli software vib remove --vibname=scsi-qedil

After that I was able to reboot and perform the upgrade.

These appear to be QLogic drivers that likely came with the vSphere 6.7 Dell ISO and have since been dropped or replaced on the vSphere 7.0 ISO. I don’t use any QLogic hardware in my server, so removing them didn’t pose much of a risk to me.
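
If you want to check up front whether any other OEM packages from the Dell ISO are lingering before attempting your own upgrade, the same listing command can be filtered on the OEM tag that shows up in those version strings (just a quick sanity check, nothing the installer requires):

esxcli software vib list | grep -i oem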

Backing up a VM with a PCIe device attached to it with Veeam

In a previous post I talked about installing a Quadro P620 into my ESXi host so I could attach it to my Plex VM. This worked out great except my Veeam backups started failing.

There is a limitation in VMware vSphere where you can’t take a Snapshot of a VM with a PCIe device passed through to it.

One option is to install the Veeam Agent for the OS you’re running and use it to take guest-based backups. This isn’t ideal in my opinion; I would much rather keep my host-based backups of the VM. Fortunately there is an easy solution to this problem.

Shut off the VM before taking the Veeam backup and then power it back on after the backup is complete.

To get this working you need to install the VMware PowerShell Module on your Veeam server. To do this perform the following steps:

  1. Right click on the PowerShell shortcut and choose ‘Run as Administrator’
  2. Run the following commands:
    Find-Module -Name VMware.PowerCLI
    Install-Module -Name VMware.PowerCLI -Scope AllUsers
    Get-Command -Module *VMWare*
    Set-PowerCLIConfiguration -Scope AllUsers -ParticipateInCeip $false -InvalidCertificateAction Ignore
  3. You should see a large list of VMware PowerShell commands output which means you’ve successfully installed the module

Next up, you need to make sure your Veeam services are running under a service account with the appropriate permissions in vCenter. This is normally a best practice anyway, so chances are you’ve already done this. In my case I’d installed Veeam under a local service account (I don’t know why), so to fix it I switched the following Windows services to run as my backup operator account, which already had Domain Admin and Backup Operator rights, Local Admin on the Veeam server, and Administrator permissions in vSphere.

The services were:

  • Veeam Backup Enterprise Manager
  • Veeam Backup Service
  • Veeam Broker Service
  • Veeam Cloud Connect Service
  • Veeam Guest Catalog Service
  • Veeam RESTful API Service

I then rebooted my Veeam server.

I already have my vCenter Server joined to my domain, but I did run into an issue where single sign-on wasn’t working properly. If I attempted to connect to vCenter via PowerShell using “Connect-VIServer <VCENTER SERVER FQDN>”, I would be prompted for credentials, which shouldn’t happen since the account I’m logged in as is an Administrator in vCenter.

It turned out I needed to add the AD group that gives users administrative access to the vCenter Global Permissions list:

  1. Login to vCenter as an administrator
  2. Click ‘Menu’ and ‘Administration’
  3. Click ‘Global Permissions’
  4. Click ‘Add’
  5. Change the ‘User’ field to your domain, search for the user or security group (I recommend security groups) and select it, make sure the role is ‘Administrator’ and check ‘Propagate to children’ and click ‘Ok’

After doing this I could run “Connect-VIServer <VCENTER SERVER FQDN>” and not be prompted for credentials.
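
If you want to sanity check this end to end before touching the backup job, a quick test from a PowerShell prompt while logged in as the Veeam service account should connect and list VMs without ever prompting (the FQDN below is just a placeholder):

# Connects using the logged-on Windows account; no credential prompt expected
Connect-VIServer -Server "vcenter.example.com"
# Listing VMs confirms the session actually has permissions in vCenter
Get-VM | Select-Object Name, PowerState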

Now that all the prep-work is done we can re-configure our backup job in Veeam.

First we’re going to need two scripts, one to shut down the VM and one to boot it back up. I’ve saved these scripts on my Veeam server in “C:\Scripts\<VM FQDN>\”.

The shutdown script is “shutdown.bat”; be sure to search and replace “VCENTER FQDN” and “VM FQDN” with your values. The do/while loop at the end polls the VM’s power state so the script doesn’t return (and the backup doesn’t start) until the guest has actually finished shutting down:

C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -NoLogo -executionpolicy bypass -Command Connect-VIServer -Server "VCENTER FQDN"; Shutdown-VMGuest -VM "VM FQDN" -Server "VCENTER FQDN" -Confirm:0; do{$vm=Get-VM -Name "VM FQDN"}while ($vm.PowerState -eq \"PoweredOn\")

The startup script is “startup.bat”; again, be sure to search and replace “VCENTER FQDN” and “VM FQDN” with your values:

C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -NoLogo -executionpolicy bypass -Command Connect-VIServer -Server "VCENTER FQDN"; Start-VM -VM "VM FQDN" -Server "VCENTER FQDN"; Start-Sleep -s 90
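
The Start-Sleep at the end just gives the VM a 90 second head start to boot before the job wraps up. If you’d rather wait for the guest OS to actually come up, PowerCLI’s Wait-Tools cmdlet blocks until VMware Tools responds; a variation along these lines should work, assuming VMware Tools is installed in the guest (same placeholders to replace):

C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -NoLogo -executionpolicy bypass -Command Connect-VIServer -Server "VCENTER FQDN"; Start-VM -VM "VM FQDN" -Server "VCENTER FQDN"; Wait-Tools -VM (Get-VM -Name "VM FQDN") -TimeoutSeconds 300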

Once you’ve created these, fire up the Veeam console and re-configure the VM’s job:

  1. Launch Veeam
  2. Find the backup job for your VM, right click on it and choose ‘Edit’
  3. Go to ‘Storage’
  4. Click ‘Advanced’
  5. Go to ‘Scripts’
  6. Checkmark ‘Run the following script before the job:’ and select your “shutdown.bat” script
  7. Checkmark ‘Run the following script after the job:’ and select your “startup.bat” script
  8. Click ‘Ok’
  9. Click ‘Finish’
  10. Perform a test run of the job; you can monitor the start-up/shutdown in vCenter

That’s it. Minor inconvenience but it works. Hopefully vSphere 7 will allow for snapshots on VMs with pass-through devices configured.

Adding a Quadro P620 to my Plex VM

I currently run Plex in a CentOS 7 VM (on top of vSphere 6.7) with 2 vCPUs and 2GB of vRAM.

Whenever I need to transcode video to sync it to a mobile device for a trip, the process takes a while and consumes a lot of CPU on the VM. I could just add more vCPUs to the VM, but I have a limit on how much CPU I have to toss around and there are more efficient ways to transcode video.

I bought my Dell T340 specifically with a Xeon E-2176G CPU in it so I could take advantage of the on-board GPU to handle my transcoding work. After a bunch of back and forth with VMware, Dell and Intel, it turns out that Dell did not build the T340 in a way that allows it to actually use the on-board GPU on my CPU. Why they offer it as a choice, I don’t know, but here we are.

My next option was to purchase a video card to do the work. I did some research and landed on the Quadro P620 (specifically the PNY version) as the most affordable card with the features I wanted, specifically NVENC. Added bonus: it supports HEVC (H.265), which should future-proof me for a while and allow me to eventually take advantage of this card for transcoding my Blu-rays to H.265, but that’s another post.

The card arrived, I installed it, enabled it for passthrough in vSphere, attached it to my Plex VM and booted it up.

I downloaded the latest NVIDIA driver to my VM and ran the installer (as root):

# chmod a+x NVIDIA-Linux-x86_64-430.50.run
# ./NVIDIA-Linux-x86_64-430.50.run

The installation was straightforward; in fact it took care of everything I needed. It automatically blacklisted the default video driver for me, asked me to reboot and re-run the installer, which I did, and everything almost worked.
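
For the curious, the installer does this by dropping a modprobe blacklist for the open-source nouveau driver. On CentOS 7 the generated file typically looks something like this (the exact path and filename can vary by driver version):

# /usr/lib/modprobe.d/nvidia-installer-disable-nouveau.conf (written by the NVIDIA installer)
blacklist nouveau
options nouveau modeset=0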

After the driver was successfully installed I ran the nvidia-smi tool provided with the drivers to verify things and was greeted with:

# nvidia-smi

Unable to determine the device handle for GPU 0000:03:00.0: Unknown Error

Fortunately this issue is well documented on the internet, and the quick fix was to shut down the VM and make a tweak to its configuration. Since I have vCenter I used the GUI to solve this problem instead of downloading the VMX file, editing it and re-uploading it (the raw VMX entry is shown after the steps below):

  1. Login to vCenter
  2. Right click and choose ‘Edit Settings’ on the VM
  3. Go to ‘VM Options’ and expand ‘Advanced’
  4. Click ‘Edit Configuration’
  5. Click ‘Add Configuration Params’
  6. Enter the following without quotes:
    Name: “hypervisor.cpuid.v0”
    Value: “FALSE”
  7. Click ‘Ok’
  8. Click ‘Ok’
  9. Boot up the VM

Once the VM came back up I got the output I was expecting from nvidia-smi:

# nvidia-smi

Thu Oct 24 18:36:20 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.50       Driver Version: 430.50       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro P620         Off  | 00000000:13:00.0 Off |                  N/A |
| 40%   54C    P0    N/A /  N/A |     10MiB /  2000MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

The last thing to do before testing was to make sure Plex was configured to use hardware transcoding:

  1. Login to your Plex’s WebUI
  2. Under ‘Settings’ click ‘Transcoder’
  3. Checkmark ‘Use hardware acceleration when available’
  4. Click ‘Save Changes’

I then gave things a quick test by trying to sync a TV show to my iPhone and then re-ran nvidia-smi:

# nvidia-smi

Thu Oct 24 18:38:59 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.50       Driver Version: 430.50       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro P620         Off  | 00000000:13:00.0 Off |                  N/A |
| 41%   57C    P0    N/A /  N/A |    177MiB /  2000MiB |     20%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0     22510      C   /usr/lib/plexmediaserver/Plex Transcoder     167MiB |
+-----------------------------------------------------------------------------+

Bingo, that was it. Now, how much faster was the Quadro P620 than my Xeon E-2176G? Roughly 4.5x faster.

My Plex transcoding settings are:

  • Transcoder quality: Prefer higher quality encoding
  • Background transcoding x264 preset: Medium
  • Maximum simultaneous video transcode: 4

But wait, you might say, why set “Maximum simultaneous video transcode” to 4 when a Quadro P620 can only do 2?

This is why. The nvidia-patch project removes the driver’s artificial limit on concurrent NVENC sessions, and applying it only took a few seconds as root:

# git clone https://github.com/keylase/nvidia-patch.git
# cd nvidia-patch/
# bash patch.sh


Silencing my Dell T340 – Part 2

Update 2021-01-25: There is now a Part 3 to this project

For those of you who read Part 1 of this project, you’ll remember I said I’d try installing a better CPU heat sink once I had some cash.

Well I did and it didn’t work. This post will hopefully save someone 1.5 hours of their life.

The mission was to get a Noctua NH-D15 installed in my Dell T340.

The first step was to remove the existing heat sink and left over thermal compound:

First thing I wanted to do was make sure the heat sink would fit before trying to figure out the bracket.

I tried it in the recommended configuration and it does fit but I’d lose access to my bottom PCIe 8x slot.

I then rotated the heat sink 90 degrees to a less than ideal airflow path which made it possible to JUST BARELY use the bottom PCIe 8x slot.

If I were keeping the Noctua installed, I’d put a strip of electrical tape along the edge of the PCIe card in the bottom slot so that if the heat sink ever made contact with the card it wouldn’t short anything out.

Ok, so, the heat sink appears to barely fit. Now to mount it. This is where everything went wrong.

The pre-installed CPU socket back plate is not compatible with the Noctua heat sink. I removed the motherboard and then the OEM CPU back plate which also required removing the CPU locking assembly (the arm and bracket that hold the CPU to the motherboard socket) because it was screwed into the OEM CPU back plate.

Once I had removed the OEM CPU back plate I installed the Noctua provided one and quickly realized it wasn’t going to work. The Noctua plate does not have threaded holes so I was unable to re-attach the CPU locking assembly.

At this point there was nothing more I could do; I had to re-install the original OEM CPU back plate and OEM cooler and then put everything back in the case. In theory, since the motherboard is raised a fair distance from the back of the T340’s case, I could have gotten some washers and nuts and attempted to re-attach the CPU locking assembly using them and the Noctua back plate, but this ultimately seemed like a bad idea.

I reached out to Noctua support just to see if they had a compatible back plate. They said “no” and that their products are not compatible with this motherboard.

I’ve also reached out to Arctic, be quiet!, Cooler Master, Corsair, Thermaltake and Zalman to see if any of them sell a compatible cooler. So far the results are:

  • Arctic – No compatible products
  • be quiet! – No compatible products
  • Cooler Master – Didn’t read my request properly and didn’t actually answer my question
  • Corsair – No compatible products
  • Thermaltake – No response yet
  • Zalman – No response yet

I will update this post when/if the remaining companies respond to me.

I’m just going to have to live with my current solution which does an adequate job.