Metalbird

Estimable
Jul 13, 2015
Hi,

For a few weeks now, I have been having issues with my mobile workstation laptop (HP EliteBook 8560w). They occur even when the system is not running hot, for example while surfing the web or looking for a file in Windows Explorer.
Five different things happen:

1. Display Driver crash (often)
The screen turns black, without signal, and returns about five seconds later with a pop-up in the taskbar tray, saying:
Display driver “NVIDIA Windows Kernel Mode Driver, Version 353.30” stopped responding and has successfully recovered.
After that, this case repeats itself more frequently (every 30 seconds) until I reboot the system. When this happens, many programs either lose all data (e.g. Photoshop or Illustrator) or close completely.

2. Freeze/lockup (often)
The screen freezes and no input is possible. It stays like that while the fan suddenly spins up to high speed. The only way out of this state is a hard reboot by holding the power button.

3. Switches off randomly (often)
Similar to case 2. Either it freezes and then shuts off, or it just shuts off. Note that this is not a Windows shutdown, but like *bam* off.

4. Stuck in BIOS (rarely)
This happens rarely, after rebooting the system following one of the issues above. It gets stuck on the BIOS screen, the first thing you see when switching the device back on, with just the HP logo and a battery charge indicator on screen. It stays in that state until you reboot using the power button.

5. Stuck before BIOS (very rarely)
Powering the system on just results in a lit power button and a running fan. The screen stays off. Stays there until you reboot.

Things I’ve done:
• Switched all RAM. Previously had 2x8GB and 1x4GB modules inside. I took them all out and tried running them separately. The issues still occur, and I doubt all RAM modules are defective at once.
• Opened up the laptop and cleaned all vents and the entire inside.
• Virus scans. MalwareBytes did not find any issues.
• Windows Event Viewer just says the device suddenly lost power, probably logged when I had to hard reboot it.
• HP BIOS diagnostic tests (start up, run-in, memory and disk) were all successful.
• Memory tests all ran successfully, both the Windows memory tests and the ones in the BIOS.
• Completely uninstalled all nVidia Graphics drivers + accessories and reinstalled them.
• Cleared temporary data.

Maybe helpful information:
• Issues occur even when docked into a docking station with sufficient cooling and a separate screen.
• I’ve got “lag spikes” in Windows and in games every few seconds or minutes.
• Laptop locked up and switched off (or just froze) while running diagnostic tests, so it should not be related to my Windows installation.
• Issues happen with all three different charging cables attached, while in the docking station or just running on the battery.
• Issues occur when the device is both cold and hot.
• I still have pickup-and-return warranty on the device, so with enough information, I can get them to repair it.

System info:
HP EliteBook 8560w LG663EA
Intel Core i7 2630QM / 2.0 GHz
nVidia Quadro 1000M
256 GB SSD
20 GB RAM (8 + 8 + 4 GB modules; issues occur with all combinations of RAM modules)
Windows 8.1 Pro

The problem is that these issues happen very randomly. Some days they do not occur at all, on other days they occur every few minutes. If the service guys can’t reproduce the issues, I will get it back unrepaired.

It would be great if someone could give me some troubleshooting tips.

Well, thanks in advance!
David
 
Solution

cdrkf

Honorable
Mar 18, 2013
This sounds like a faulty graphics card to me. The repeated driver crashes are telling (especially as they get more frequent). I had this happen to me on an old laptop.

The only good thing for you is a machine like that probably uses an 'MXM' module for the gpu rather than having it soldered, so it should be easily replaced. The only thing I can suggest for now is try disabling the nVidia GPU in device manager (as your Core i7 should have integrated Intel HD graphics for basic tasks). If that prevents problems in light duty stuff (e.g. web / desktop) then I think you've found the culprit.
 

laptux

Estimable
Aug 10, 2015
I have almost the same problem with my HP EliteBook 8560w.

There are, however, two important differences:
1) My laptop has an AMD FirePro M5950 instead of the nVidia Quadro graphics card. Also, the internal Intel graphics solution has been disabled by HP, so I cannot switch back to the IGP. The AMD FirePro M5950 is the only working graphics adapter in my laptop.

2) My laptop has Linux as the only operating system. The problem is therefore not likely to be related to the OS because I have the same symptoms with Linux as Metalbird has with his Windows based laptop.

And some minor differences:
1) My laptop has 12 GB RAM
2) My SSD is a 250 GB Samsung EVO

Possibly helpful information:
The problems occur even with minimal CPU usage. I have tried to analyse the logs of my system to see if there is helpful information. The problem is that when the system freezes, everything stops working. This means that at the moment the freeze occurs nothing is added to the log files, and before the event happens there is nothing suspicious that could indicate that something is wrong.
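One way to narrow this down is to record a small snapshot every few seconds and force each line to disk, so the last line written before a freeze survives. The following is only a minimal sketch under assumptions: it assumes the kernel exposes temperatures under `/sys/class/thermal/thermal_zone*/temp` (in millidegrees Celsius), which may not match the sensors on this laptop, and the file name `freeze-trace.log` and the function names are made up for illustration.

```python
import glob
import os
import time


def read_temps(base="/sys/class/thermal"):
    """Return {zone_name: degrees_C} for every readable thermal zone under base."""
    temps = {}
    for path in sorted(glob.glob(os.path.join(base, "thermal_zone*", "temp"))):
        try:
            with open(path) as f:
                millideg = int(f.read().strip())
        except (OSError, ValueError):
            continue  # zone unreadable or empty; skip it
        zone = os.path.basename(os.path.dirname(path))
        temps[zone] = millideg / 1000.0
    return temps


def append_sample(log_path, temps):
    """Append one timestamped line and force it to disk before returning."""
    fields = " ".join(f"{name}={value:.1f}" for name, value in sorted(temps.items()))
    line = f"{int(time.time())} {fields}\n"
    with open(log_path, "a") as log:
        log.write(line)
        log.flush()
        os.fsync(log.fileno())  # ask the OS to write the line out now
    return line


def run(log_path="freeze-trace.log", interval=2.0, samples=None):
    """Sample every `interval` seconds; samples=None means run until interrupted."""
    count = 0
    while samples is None or count < samples:
        append_sample(log_path, read_temps())
        count += 1
        time.sleep(interval)
```

Calling `run()` in a terminal and checking the last line of the file after the next freeze would at least show the time of the freeze and the last temperatures recorded before it.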

Only once I got lucky enough to get an error message that might be useful:

 mce: [Hardware Error]: CPU 4: Machine Check Exception: 5 Bank 4: b200000011000402
 mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff810d8349> {futex_wait_queue_me+0x89/0x140}
 mce: [Hardware Error]: TSC 24fec9a254c
 mce: [Hardware Error]: Processor 0:206a7 TIME 1437729994 SOCKET 0 APIC microcode 18
 mce: [Hardware Error]: Run the above through 'mcelog --ascii'
 mce: [Hardware Error]: Some CPUs didn't answer in synchronization
 mce: [Hardware Error]: Machine check: Invalid
 Kernel panic - not syncing: Fatal machine check on current CPU
 Shutting down cpus with NMI
 drm_kms_helper: panic occurred, switching back to text console
 Rebooting in 30 seconds..
 ACPI MEMORY or I/O RESET_REG: 
 ACPI MEMORY or I/O RESET_REG:

I do not know if this is the complete error trace. This is the only piece that I could write down before the system turned off completely.
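The status value in the first line of the trace (`b200000011000402`) can be split into its bit fields by hand. The sketch below assumes the value follows Intel's documented architectural layout for the machine-check status registers (flags in the top bits, MCA error code in bits 15:0, model-specific code in bits 31:16); the function name is made up for illustration, and the decoding alone does not say which component is at fault.

```python
# Bit positions of the architectural flags in an IA32_MCi_STATUS value.
FLAGS = {
    "VAL": 63,    # register holds a valid error
    "OVER": 62,   # an earlier error was overwritten
    "UC": 61,     # error was not corrected by hardware
    "EN": 60,     # reporting for this error was enabled
    "MISCV": 59,  # MISC register holds extra information
    "ADDRV": 58,  # ADDR register holds an address
    "PCC": 57,    # processor context may be corrupted
}


def decode_mci_status(status):
    """Split a 64-bit machine-check status value into flags and error codes."""
    decoded = {name: bool((status >> bit) & 1) for name, bit in FLAGS.items()}
    decoded["mca_error_code"] = status & 0xFFFF
    decoded["model_specific_code"] = (status >> 16) & 0xFFFF
    return decoded


result = decode_mci_status(0xB200000011000402)
set_flags = [name for name in FLAGS if result[name]]
print("flags set:", ", ".join(set_flags))            # flags set: VAL, UC, EN, PCC
print("MCA error code:", hex(result["mca_error_code"]))             # 0x402
print("model-specific code:", hex(result["model_specific_code"]))   # 0x1100
```

With UC and PCC set, the processor is reporting an uncorrected error, which fits the kernel panic that follows in the trace. Running the full trace through `mcelog --ascii`, as the kernel message itself suggests, should give a more complete interpretation.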

Things I tried so far:
1) Memory testing: I have run memtest86+ without any problems.
2) Graphics testing: I have run a few video card benchmark tests and they did not result in crashes.
3) Hard drive testing: The SMART statistics do not indicate there is any problem.
4) CPU testing: I have tried CPU Burn to see if the CPU might be responsible for the freezes. This does not seem to be the case: the temperature of the loaded core rises, but the task is eventually passed to another core without any problem.
Only when I start 4 new CPU Burn instances, to make sure all cores are running at 100%, does the temperature climb further. Under these extreme conditions the temperature is driven up over 80 degrees before the system shuts down. When this happens, the symptoms are very different from the lock-ups / freezes. I conclude that overheating of the processor is very difficult to achieve and certainly not the source of my problem, because the symptoms are not the same.
Normal operating temperatures vary between 55 and 65 degrees Celsius. Only under heavy load might the temperature rise up to 73 degrees Celsius.
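For anyone who wants to repeat the all-cores test without starting several instances by hand, here is a rough sketch that launches one busy-loop process per logical CPU for a fixed time. This is not CPU Burn: a plain busy loop usually produces less heat than a dedicated burn tool, and the function name and defaults are made up for illustration.

```python
import os
import subprocess
import sys

# Child program: spin in a tight loop until the deadline passes.
BUSY_LOOP = (
    "import sys, time\n"
    "end = time.monotonic() + float(sys.argv[1])\n"
    "while time.monotonic() < end:\n"
    "    pass\n"
)


def load_all_cores(seconds, workers=None):
    """Run one busy-loop process per logical CPU and return their exit codes."""
    if workers is None:
        workers = os.cpu_count() or 1
    procs = [
        subprocess.Popen([sys.executable, "-c", BUSY_LOOP, str(seconds)])
        for _ in range(workers)
    ]
    return [p.wait() for p in procs]
```

Something like `load_all_cores(60)` keeps every logical CPU busy for a minute, which is long enough to watch the temperatures and fan speed in a second terminal.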

I really have no clue what to do now.
 

cdrkf



Uhm, your CPU should never be getting that hot! And you shouldn't be able to get it up to 80 even with a CPU burn test. You've had this laptop for a while, I'm guessing? I had issues like this with an Asus laptop. It turned out that dust / fluff had built up in between the cooling fan and the heatsink grille, seriously reducing airflow. Dismantling the heatsink assembly and cleaning out all the gunk with a (spare) toothbrush made a massive difference. Not only was the machine more stable afterwards, it was also faster, as it prevented the CPU / GPU from down clocking due to temperature (note they often share the same heatsink / fan).

I'd definitely take the cover off the cooling assembly and if needed give it a clean. Cooling is a big problem on modern laptops.
 
Solution

laptux



Thank you very much for the quick response cdrkf!

I cleaned my laptop a few weeks ago because I suspected a heating problem as a possible cause of the freezes. After the cleaning, the temperatures dropped about 5 degrees Celsius on average.

Currently the temperatures of my laptop while running idle (<1% load) are:
Core 0: 51°C
Core 1: 49°C
Core 2: 46°C
Core 3: 51°C

Sensor 1: 50°C
Sensor 2: 47°C
Sensor 3: 42°C
Sensor 4: 30°C

When I run regular desktop applications, the processor temperatures increase to roughly 58°C.
Only when I run slightly more demanding tasks does the temperature increase to 60°C and higher.
The strange thing, however, is that the processor load is never really high. Another strange thing is that the fan rarely runs at maximum RPM.

I agree that the temperatures are high, but at the same time this is a workstation model laptop with 4 cores and a big power supply that is known to run hot.
If the temperatures are too high, then why doesn't the fan blow at its maximum potential? Could it be that the fan is actually at fault here?

I will follow your advice and take the laptop apart to clean the components as thoroughly as possible.
Hopefully this will help, but I fear that I cannot improve much since the last cleaning.