Holy Cuda 10.0, Batman

Oh my god, I just got DAE working with Cuda 10.0 - this machine has no Cuda 9 on it!

Experimentation time…

Hmm - not even optimized it for the GPU yet - should boost the speed if I do (takes ages to build the required DLL)…

Getting this far has taken three days’ worth of frustration (plus many more days’ worth of previous failed attempts), not even knowing if it was gonna work - I’ll stick an optimized build on tonight to see what difference it makes.

Just wiped my laptop - so that’s got nuffin, I’ll have to try that next…

Well, laptop worked - just started building the fully-specced version that handles all cards up to the 2080. This will take hours - started at midnight on the dot, and if there are no errors it’ll finish sometime later (possibly much later - they do warn about this)

do you have a git for this project?

https://github.com/tensorflow/tensorflow but it’s very broken and fixing it was a matter of lots of trial + error with some luck finding a specific bug report that got it working. From a blind start I’ve been playing with this on and off since Beta 1.2.7 came out and just had another major push at it cos of the Russian PC (which finally worked)

The all-cards version took four hours to compile last night (ran out of disk space at one point) but it works perfectly with Cuda10 - thing is the DLL is huge so I’m doing card-specific builds ATM as the DLLs will be a lot smaller.

Hi, @Peardox!
Can I help you test CUDA 10 with DAE?
I want to run it with 4 Nvidia Quadro GPUs (quite old models, but there are 4 of them: 1xK4000 and 3xK620, 1920 CUDA cores in total), but with CUDA 9.0 only 1 GPU works :frowning:

Please tell me how I can start DAE with CUDA 10

You don’t need Cuda 10 - the gory story follows

CUDA support depends on something called the Compute Capability (an NVidia term). Looking your cards up, they have the following values for that figure…

K620 = 5.0
K3000 = 3.0

If you look at the CUDA page on Wikipedia you’ll see that CUDA 9.0 to 10.0 cover your cards (10 won’t make things better)

It should also be noted that TF2 (coming sometime in the not too distant future to DAE) won’t help either owing to the age of the K3000s (now dropped completely)

This means that CUDA 9.0 should be fine as all your cards are covered by that version.

A rather annoying fact is that Tensorflow (another bit of the puzzle) is compiled with specific Compute Capabilities in mind, and the default build for Tensorflow includes 3.5 and up, which explains why your K3000s don’t work - you need a Tensorflow specially built to include 3.0
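To make the rule above concrete, here is a rough Python sketch of the check. It assumes a build can be summed up by its lowest supported Compute Capability (a simplification: real builds list specific capabilities), and it gives the K4000 the 3.0 figure quoted above for the K3000. The function and table names are made up for illustration.

```python
# Compute Capability per card, as (major, minor).
# K620 = 5.0 is quoted in the thread; K4000 is treated as 3.0 like the K3000.
CARD_CAPABILITY = {"K620": (5, 0), "K4000": (3, 0)}

# Per the thread: the default Tensorflow build includes 3.5 and up.
DEFAULT_BUILD_MINIMUM = (3, 5)


def usable_cards(cards, minimum):
    """Return the cards whose Compute Capability is at least `minimum`."""
    return [name for name in cards if CARD_CAPABILITY[name] >= minimum]


if __name__ == "__main__":
    installed = ["K4000", "K620", "K620", "K620"]
    # With the default build only the K620s qualify.
    print(usable_cards(installed, DEFAULT_BUILD_MINIMUM))
    # A build that includes 3.0 would accept every card.
    print(usable_cards(installed, (3, 0)))
```

This only says which cards a build could use in principle; it says nothing about whether DAE then spreads work across all of them.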

I can’t remember if I built a 3.0 inclusive Tensorflow for Cuda 9.0 - I’ll dig the one I have out later and you can try it (it’ll either work or it won’t)

What operating system are you running on? This also affects which TF you need.

Hi, and thanks for the detailed reply.

Yesterday I tested it with my 3xK620 without the K4000 (~same as K3000) and the GPU was not used (CPU only, even in the DAE 1.2.7 GPU version). In my case it works (GPU used) only with 1xK620, not with 3xK620. But that is no speed-up for me, because I have a Ryzen Threadripper 1900X CPU (8c16t): it processes the same image in 19 sec, while 1xK620 takes 29 sec. I also have an R9 3950X (16c32t), but I hope that if 3 or 4 of my Quadros work with DAE, all together they will be faster than my CPU… And then I will be able to schedule an upgrade of my video cards for moooore speed :slight_smile:

I am working on Windows 10 LTSC 1809 x64. If you have a TF for DAE built with Compute Capability 3.0 (supported by my K4000) I will be happy!

Found it https://peardox.com/downloads/tensorflow_jni_gpu.zip

I can’t for the life of me remember if I included 3.0 in this one - it is at least possible…

All you need to do is extract that ZIP and stick tensorflow_jni.dll somewhere in your path

As I say, I can’t be positive this one has 3.0 in it

How do you know it’s not working in the first place? The K3000s in particular are rather old so I’m not sure how much of a boost they’d provide. Your CPU ain’t bad either. The only way to be sure what’s happening is by checking what the logs say. On your machine they will be located somewhere like c:\Users\drams\.deeparteffects\debug.log (change drams to whatever your Win10 username is)
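If reading the whole log by eye is a pain, a small Python sketch like this can pull out the interesting lines. It assumes a working GPU setup writes lines mentioning CUDA or cuDNN into debug.log, which is a guess about the native library's output, and it skips the %PATH% dump because that always mentions the CUDA directories. The function name and keyword list are made up for illustration.

```python
import re
import sys
from pathlib import Path

# Assumed keywords: what a GPU-enabled run actually logs is not confirmed.
KEYWORDS = re.compile(r"cuda|cudnn", re.IGNORECASE)


def cuda_mentions(log_text):
    """Return log lines mentioning CUDA or cuDNN, ignoring %PATH%-style dumps."""
    hits = []
    for line in log_text.splitlines():
        if line.lstrip().startswith("%"):
            # Environment dump such as "%PATH%: ..." - not evidence of GPU use.
            continue
        if KEYWORDS.search(line):
            hits.append(line)
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python scan_log.py <path to debug.log>
    text = Path(sys.argv[1]).read_text(encoding="utf-8", errors="replace")
    matches = cuda_mentions(text)
    print("\n".join(matches) if matches else "No CUDA/cuDNN lines found")
```

An empty result suggests the GPU libraries never loaded; a non-empty one still needs reading, since an error message would match too.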

=== Idiot guide ===

Make a directory on your computer and call it something easy to remember, e.g. dae - this is now c:\dae

Now you want to add c:\dae to your path…

You can do this by going to Settings and typing in “environment” in the box marked “Find a setting”. This will give you two options - choose “Edit environment variables for your account” and you’ll get a window listing your variables.

Click on the line that starts with “Path” in the top part of this window, click Edit and enter the path where you put tensorflow_jni.dll (e.g. c:\dae)

Click OK to dismiss the window(s) and close the settings window

Start DAE and see if it worked
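Before starting DAE it may be worth confirming the DLL really is visible through the path. The Python sketch below lists every PATH directory that contains a given file; the function name is made up, and it assumes (as the guide above does) that DAE finds tensorflow_jni.dll via PATH. Run it from a freshly opened terminal, since programs started before the change still see the old PATH.

```python
import os
from pathlib import Path


def find_on_path(filename, path_value=None):
    """Return every directory listed in PATH that contains `filename`."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    found = []
    for entry in path_value.split(os.pathsep):
        entry = entry.strip()
        # Skip the empty entries left behind by doubled separators (";;").
        if entry and (Path(entry) / filename).is_file():
            found.append(entry)
    return found


if __name__ == "__main__":
    hits = find_on_path("tensorflow_jni.dll")
    print(hits if hits else "tensorflow_jni.dll is not on PATH")
```

If more than one directory turns up, the first one listed is normally the copy that gets loaded.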

Thanks a lot! I tested it in various hardware configs:

1. config: 1xK4000 and 3xK620

debug log

//IN THIS SESSION DAE GPU USED ONLY CPU (seen by windows task manager, screenshot attached)

OS: Microsoft Windows 64Bit Version: 10
RAM: 13818MB of 16283MB available
CPU: AuthenticAMD Family 23 Model 1 Stepping 1 with 16 cores
DISK: 229922MB usable of 229922MB free of 476013MB total space
%JAVA HOME%:null
%PATH%: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\bin;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\libnvvp;;C:\Program Files (x86)\AMD APP\bin\x86_64;C:\Program Files (x86)\AMD APP\bin\x86;C:\Program Files (x86)\Common Files\Oracle\Java\javapath;C:\ProgramData\Oracle\Java\javapath;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0;C:\Windows\System32\OpenSSH;C:\Program Files\PuTTY;C:\Program Files\nodejs;C:\Program Files\Git\cmd;C:\Program Files (x86)\ATI Technologies\ATI.ACE\Core-Static;C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common;C:\Users\user\AppData\Local\Microsoft\WindowsApps;C:\Users\user\AppData\Roaming\npm;C:\Program Files\JetBrains\PhpStorm 2019.2.5\bin;;C:\DAE;;C:\Users\user\AppData\Roaming\DeepArtEffects.\jre\bin

2022-02-05 21:24:52,001 INFO c.d.d.a [main] Running on 64bit - Windows 10!
2022-02-05 21:24:52,001 INFO c.d.d.a [main] JAVA: 11.0.11 - 64bit JVM
2022-02-05 21:24:52,017 DEBUG c.d.d.a [main] User Agent: Deep-Art-Effects-Desktop/1.2.7 (windows)
2022-02-05 21:24:52,156 DEBUG c.d.d.u.d [main] loading english!
2022-02-05 21:24:52,171 DEBUG c.d.d.u.d [main] LOADING LANGUGAGE FROM: /lang/en_US.properties
2022-02-05 21:24:52,171 DEBUG c.d.d.u.d [main] LOADING LANGUGAGE FROM: /lang/ru_RU.properties
2022-02-05 21:24:52,171 DEBUG c.d.r.r.a.b [main] initializing OpenCV…
2022-02-05 21:24:53,072 DEBUG c.d.r.r.a.b [main] initialized OpenCV!
2022-02-05 21:24:53,072 DEBUG c.d.r.c [main] initializing renderingManager…
2022-02-05 21:24:53,072 DEBUG c.d.r.c [Thread-2] Initializing DeepArtEffects Core…
2022-02-05 21:24:53,072 DEBUG c.d.r.c [main] RenderingManager initialized.
2022-02-05 21:24:55,086 DEBUG c.d.d.g.c [JavaFX Application Thread] initializing Window…
2022-02-05 21:24:55,391 INFO c.d.r.c [Thread-2] DeepArtEffects Core initialized!
2022-02-05 21:24:59,622 DEBUG c.d.d.g.c [JavaFX Application Thread] initializing GUI…
2022-02-05 21:24:59,927 DEBUG c.d.d.g.c [JavaFX Application Thread] GUI initialized!
2022-02-05 21:24:59,938 DEBUG c.d.d.g.c [JavaFX Application Thread] Window initialized!
2022-02-05 21:25:00,209 DEBUG c.d.d.g.c [JavaFX Application Thread] checking License…
2022-02-05 21:25:00,209 DEBUG c.d.d.g.c [JavaFX Application Thread] TRID: d6b153f9e91637a8
2022-02-05 21:25:00,845 INFO c.d.d.g.c [Thread-8] newest version: true
2022-02-05 21:25:00,983 DEBUG c.d.d.c.a [Thread-7] tracked
2022-02-05 21:25:01,425 DEBUG c.d.d.c.a [Thread-6] track w/ param
2022-02-05 21:25:02,342 INFO c.d.d.u.c [JavaFX Application Thread] null
2022-02-05 21:25:10,576 DEBUG c.d.d.b.a [JavaFX Application Thread] opening image from file: C:\Users\user\Desktop\DAE\pexels-nuta-sorokina-10220117.jpg
2022-02-05 21:25:16,807 DEBUG c.d.d.b.a [JavaFX Application Thread] successfully opened image from file!
2022-02-05 21:25:17,375 DEBUG c.d.d.c.a [Thread-10] tracked
2022-02-05 21:25:51,372 DEBUG c.d.r.b [JavaFX Application Thread] starting image renderer!
2022-02-05 21:25:51,372 DEBUG c.d.r.d [JavaFX Application Thread] using ARTFILTER… checking if available!
2022-02-05 21:25:51,373 DEBUG c.d.r.d [JavaFX Application Thread] Style not available… downloading!
2022-02-05 21:25:51,374 DEBUG c.d.r.c [JavaFX Application Thread] core is initialized!
2022-02-05 21:25:51,379 DEBUG c.d.r.s.ArtFilter [Thread-44] downloading style c7985851-1560-11e7-afe2-06d95fe194ed (Eye)
2022-02-05 21:25:52,452 DEBUG c.d.r.d [Thread-44] waiting for download to finish…
2022-02-05 21:25:52,477 DEBUG c.d.r.c [JavaFX Application Thread] checking if core is initialized…
2022-02-05 21:25:52,477 DEBUG c.d.r.c [JavaFX Application Thread] core is initialized!
2022-02-05 21:25:52,477 DEBUG c.d.r.c [JavaFX Application Thread] preparing progressview…
2022-02-05 21:25:52,478 DEBUG c.d.r.c [JavaFX Application Thread] rendering ARTFILTER
2022-02-05 21:25:52,480 DEBUG c.d.r.c [Thread-49] session not yet available - waiting…
2022-02-05 21:25:52,532 DEBUG c.d.r.c [Thread-49] Session available!
2022-02-05 21:25:52,532 DEBUG c.d.r.c [Thread-49] rendering now…
2022-02-05 21:25:52,533 DEBUG c.d.r.d [Thread-44] succesfully downloaded style!
2022-02-05 21:25:52,753 DEBUG c.d.c.s.d [Thread-49] Source width: 1536, height: 1920
2022-02-05 21:25:52,757 DEBUG c.d.c.s.d [Thread-49] x: 0, y: 0, width: 512, height: 512
2022-02-05 21:25:52,766 DEBUG c.d.c.s.d [Thread-49] x: 0, y: 359, width: 512, height: 512
2022-02-05 21:25:52,767 DEBUG c.d.c.s.d [Thread-49] x: 0, y: 718, width: 512, height: 512
2022-02-05 21:25:52,768 DEBUG c.d.c.s.d [Thread-49] x: 0, y: 1077, width: 512, height: 512
2022-02-05 21:25:52,770 DEBUG c.d.c.s.d [Thread-49] x: 0, y: 1436, width: 512, height: 484
2022-02-05 21:25:52,771 DEBUG c.d.c.s.d [Thread-49] x: 359, y: 0, width: 512, height: 512
2022-02-05 21:25:52,772 DEBUG c.d.c.s.d [Thread-49] x: 359, y: 359, width: 512, height: 512
2022-02-05 21:25:52,774 DEBUG c.d.c.s.d [Thread-49] x: 359, y: 718, width: 512, height: 512
2022-02-05 21:25:52,775 DEBUG c.d.c.s.d [Thread-49] x: 359, y: 1077, width: 512, height: 512
2022-02-05 21:25:52,776 DEBUG c.d.c.s.d [Thread-49] x: 359, y: 1436, width: 512, height: 484
2022-02-05 21:25:52,778 DEBUG c.d.c.s.d [Thread-49] x: 718, y: 0, width: 512, height: 512
2022-02-05 21:25:52,779 DEBUG c.d.c.s.d [Thread-49] x: 718, y: 359, width: 512, height: 512
2022-02-05 21:25:52,780 DEBUG c.d.c.s.d [Thread-49] x: 718, y: 718, width: 512, height: 512
2022-02-05 21:25:52,782 DEBUG c.d.c.s.d [Thread-49] x: 718, y: 1077, width: 512, height: 512
2022-02-05 21:25:52,783 DEBUG c.d.c.s.d [Thread-49] x: 718, y: 1436, width: 512, height: 484
2022-02-05 21:25:52,785 DEBUG c.d.c.s.d [Thread-49] x: 1077, y: 0, width: 459, height: 512
2022-02-05 21:25:52,786 DEBUG c.d.c.s.d [Thread-49] x: 1077, y: 359, width: 459, height: 512
2022-02-05 21:25:52,787 DEBUG c.d.c.s.d [Thread-49] x: 1077, y: 718, width: 459, height: 512
2022-02-05 21:25:52,789 DEBUG c.d.c.s.d [Thread-49] x: 1077, y: 1077, width: 459, height: 512
2022-02-05 21:25:52,790 DEBUG c.d.c.s.d [Thread-49] x: 1077, y: 1436, width: 459, height: 484
2022-02-05 21:25:53,442 DEBUG c.d.d.c.a [Thread-48] track w/ param
2022-02-05 21:25:53,974 DEBUG c.d.c.a [Thread-49] Render tile 1 of 20

// CUT LOG BECAUSE FORUM LIMIT 32000 CHARS

2022-02-05 21:26:11,454 DEBUG c.d.c.a [Thread-49] Render tile 20 of 20
2022-02-05 21:26:11,456 DEBUG c.d.c.s.e [Thread-49] Start blending
2022-02-05 21:26:12,318 DEBUG c.d.c.s.e [Thread-49] Blender feeding: 859 ms
2022-02-05 21:26:12,372 DEBUG c.d.c.s.e [Thread-49] Blending: 53 ms
2022-02-05 21:26:12,390 DEBUG c.d.c.s.e [Thread-49] Writing: 18 ms
2022-02-05 21:26:12,395 DEBUG c.d.r.c [Thread-49] rendering finished…
2022-02-05 21:26:12,404 DEBUG c.d.r.c [JavaFX Application Thread] Image rendered!

[I forgot to take a Windows Task Manager screenshot, sorry, but honestly I saw 0% usage on all of the GPUs :slight_smile: ]

Result: only CPU used :frowning:


2. config: 3xK620 only

debug log

//IN THIS SESSION DAE GPU USED GPU, BUT ONLY 1 OF 3 gpus installed (seen by windows task manager, screenshot attached)

2022-02-05 21:30:09,021 INFO c.d.d.u.c [main] verbosing output!
2022-02-05 21:30:09,021 INFO c.d.d.a [main] Starting Deep Art Effects for Desktop - GPU
2022-02-05 21:30:09,289 DEBUG o.s.o.w.WindowsOperatingSystem [main] Debug privileges not enabled.
2022-02-05 21:30:09,584 INFO c.d.d.a [main]

OS: Microsoft Windows 64Bit Version: 10
RAM: 13839MB of 16283MB available
CPU: AuthenticAMD Family 23 Model 1 Stepping 1 with 16 cores
DISK: 229875MB usable of 229875MB free of 476013MB total space
%JAVA HOME%:null
%PATH%: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\bin;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\libnvvp;;C:\Program Files (x86)\AMD APP\bin\x86_64;C:\Program Files (x86)\AMD APP\bin\x86;C:\Program Files (x86)\Common Files\Oracle\Java\javapath;C:\ProgramData\Oracle\Java\javapath;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0;C:\Windows\System32\OpenSSH;C:\Program Files\PuTTY;C:\Program Files\nodejs;C:\Program Files\Git\cmd;C:\Program Files (x86)\ATI Technologies\ATI.ACE\Core-Static;C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common;C:\Users\user\AppData\Local\Microsoft\WindowsApps;C:\Users\user\AppData\Roaming\npm;C:\Program Files\JetBrains\PhpStorm 2019.2.5\bin;;C:\DAE;;C:\Users\user\AppData\Roaming\DeepArtEffects.\jre\bin

2022-02-05 21:30:09,584 INFO c.d.d.a [main] Running on 64bit - Windows 10!
2022-02-05 21:30:09,584 INFO c.d.d.a [main] JAVA: 11.0.11 - 64bit JVM
2022-02-05 21:30:09,615 DEBUG c.d.d.a [main] User Agent: Deep-Art-Effects-Desktop/1.2.7 (windows)
2022-02-05 21:30:09,756 DEBUG c.d.d.u.d [main] loading english!
2022-02-05 21:30:09,772 DEBUG c.d.d.u.d [main] LOADING LANGUGAGE FROM: /lang/en_US.properties
2022-02-05 21:30:09,772 DEBUG c.d.d.u.d [main] LOADING LANGUGAGE FROM: /lang/ru_RU.properties
2022-02-05 21:30:09,772 DEBUG c.d.r.r.a.b [main] initializing OpenCV…
2022-02-05 21:30:10,486 DEBUG c.d.r.r.a.b [main] initialized OpenCV!
2022-02-05 21:30:10,486 DEBUG c.d.r.c [main] initializing renderingManager…
2022-02-05 21:30:10,486 DEBUG c.d.r.c [Thread-2] Initializing DeepArtEffects Core…
2022-02-05 21:30:10,502 DEBUG c.d.r.c [main] RenderingManager initialized.
2022-02-05 21:30:12,805 DEBUG c.d.d.g.c [JavaFX Application Thread] initializing Window…
2022-02-05 21:30:16,402 INFO c.d.r.c [Thread-2] DeepArtEffects Core initialized!
2022-02-05 21:30:17,523 DEBUG c.d.d.g.c [JavaFX Application Thread] initializing GUI…
2022-02-05 21:30:17,820 DEBUG c.d.d.g.c [JavaFX Application Thread] GUI initialized!
2022-02-05 21:30:17,831 DEBUG c.d.d.g.c [JavaFX Application Thread] Window initialized!
2022-02-05 21:30:18,118 DEBUG c.d.d.g.c [JavaFX Application Thread] checking License…
2022-02-05 21:30:18,119 DEBUG c.d.d.g.c [JavaFX Application Thread] TRID: d6b153f9e91637a8
2022-02-05 21:30:18,813 INFO c.d.d.g.c [Thread-8] newest version: true
2022-02-05 21:30:18,893 DEBUG c.d.d.c.a [Thread-7] tracked
2022-02-05 21:30:19,533 DEBUG c.d.d.c.a [Thread-6] track w/ param
2022-02-05 21:30:36,449 DEBUG c.d.d.c.a [Thread-9] tracked
2022-02-05 21:30:41,037 INFO c.d.d.u.c [main] verbosing output!
2022-02-05 21:30:41,037 INFO c.d.d.a [main] Starting Deep Art Effects for Desktop - GPU
2022-02-05 21:30:41,269 DEBUG o.s.o.w.WindowsOperatingSystem [main] Debug privileges not enabled.
2022-02-05 21:30:41,462 INFO c.d.d.a [main]

OS: Microsoft Windows 64Bit Version: 10
RAM: 13845MB of 16283MB available
CPU: AuthenticAMD Family 23 Model 1 Stepping 1 with 16 cores
DISK: 229866MB usable of 229866MB free of 476013MB total space
%JAVA HOME%:null
%PATH%: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\bin;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\libnvvp;;C:\Program Files (x86)\AMD APP\bin\x86_64;C:\Program Files (x86)\AMD APP\bin\x86;C:\Program Files (x86)\Common Files\Oracle\Java\javapath;C:\ProgramData\Oracle\Java\javapath;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0;C:\Windows\System32\OpenSSH;C:\Program Files\PuTTY;C:\Program Files\nodejs;C:\Program Files\Git\cmd;C:\Program Files (x86)\ATI Technologies\ATI.ACE\Core-Static;C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common;C:\Users\user\AppData\Local\Microsoft\WindowsApps;C:\Users\user\AppData\Roaming\npm;C:\Program Files\JetBrains\PhpStorm 2019.2.5\bin;;C:\DAE;;C:\Users\user\AppData\Roaming\DeepArtEffects.\jre\bin

2022-02-05 21:30:41,462 INFO c.d.d.a [main] Running on 64bit - Windows 10!
2022-02-05 21:30:41,462 INFO c.d.d.a [main] JAVA: 11.0.11 - 64bit JVM
2022-02-05 21:30:41,493 DEBUG c.d.d.a [main] User Agent: Deep-Art-Effects-Desktop/1.2.7 (windows)
2022-02-05 21:30:41,633 DEBUG c.d.d.u.d [main] loading english!
2022-02-05 21:30:41,633 DEBUG c.d.d.u.d [main] LOADING LANGUGAGE FROM: /lang/en_US.properties
2022-02-05 21:30:41,633 DEBUG c.d.d.u.d [main] LOADING LANGUGAGE FROM: /lang/en_US.properties
2022-02-05 21:30:41,633 DEBUG c.d.r.r.a.b [main] initializing OpenCV…
2022-02-05 21:30:42,319 DEBUG c.d.r.r.a.b [main] initialized OpenCV!
2022-02-05 21:30:42,319 DEBUG c.d.r.c [main] initializing renderingManager…
2022-02-05 21:30:42,319 DEBUG c.d.r.c [Thread-2] Initializing DeepArtEffects Core…
2022-02-05 21:30:42,319 DEBUG c.d.r.c [main] RenderingManager initialized.
2022-02-05 21:30:42,661 DEBUG c.d.d.g.c [JavaFX Application Thread] initializing Window…
2022-02-05 21:30:45,028 INFO c.d.r.c [Thread-2] DeepArtEffects Core initialized!
2022-02-05 21:30:47,036 DEBUG c.d.d.g.c [JavaFX Application Thread] initializing GUI…
2022-02-05 21:30:47,313 DEBUG c.d.d.g.c [JavaFX Application Thread] GUI initialized!
2022-02-05 21:30:47,325 DEBUG c.d.d.g.c [JavaFX Application Thread] Window initialized!
2022-02-05 21:30:47,575 DEBUG c.d.d.g.c [JavaFX Application Thread] checking License…
2022-02-05 21:30:47,575 DEBUG c.d.d.g.c [JavaFX Application Thread] TRID: d6b153f9e91637a8
2022-02-05 21:30:48,220 DEBUG c.d.d.c.a [Thread-7] tracked
2022-02-05 21:30:48,379 INFO c.d.d.g.c [Thread-8] newest version: true
2022-02-05 21:30:48,690 DEBUG c.d.d.c.a [Thread-6] track w/ param
2022-02-05 21:31:00,030 INFO c.d.d.u.c [JavaFX Application Thread] C:\Users\user\Desktop\DAE
2022-02-05 21:31:01,519 DEBUG c.d.d.b.a [JavaFX Application Thread] opening image from file: C:\Users\user\Desktop\DAE\pexels-nuta-sorokina-10220117.jpg
2022-02-05 21:31:07,503 DEBUG c.d.d.b.a [JavaFX Application Thread] successfully opened image from file!
2022-02-05 21:31:08,038 DEBUG c.d.d.c.a [Thread-10] tracked
2022-02-05 21:31:08,139 INFO c.d.r.s.ArtFilter [JavaFX Application Thread] preview ‘c7984d3c-1560-11e7-afe2-06d95fe194ed’ loaded from disk!
2022-02-05 21:31:08,158 INFO c.d.r.s.ArtFilter [JavaFX Application Thread] preview ‘ed8e394f-1b90-11e7-afe2-06d95fe194ed’ loaded from disk!
2022-02-05 21:31:08,178 INFO c.d.r.s.ArtFilter [JavaFX Application Thread] preview ‘c7984b32-1560-11e7-afe2-06d95fe194ed’ loaded from disk!
2022-02-05 21:31:08,195 INFO c.d.r.s.ArtFilter [JavaFX Application Thread] preview ‘c7984cac-1560-11e7-afe2-06d95fe194ed’ loaded from disk!
2022-02-05 21:31:08,219 INFO c.d.r.s.ArtFilter [JavaFX Application Thread] preview ‘c7985469-1560-11e7-afe2-06d95fe194ed’ loaded from disk!
2022-02-05 21:31:08,237 INFO c.d.r.s.ArtFilter [JavaFX Application Thread] preview ‘c7985759-1560-11e7-afe2-06d95fe194ed’ loaded from disk!
2022-02-05 21:31:08,251 INFO c.d.r.s.ArtFilter [JavaFX Application Thread] preview ‘c7985796-1560-11e7-afe2-06d95fe194ed’ loaded from disk!
2022-02-05 21:31:08,266 INFO c.d.r.s.ArtFilter [JavaFX Application Thread] preview ‘c79857d7-1560-11e7-afe2-06d95fe194ed’ loaded from disk!
2022-02-05 21:31:08,283 INFO c.d.r.s.ArtFilter [JavaFX Application Thread] preview ‘c7985851-1560-11e7-afe2-06d95fe194ed’ loaded from disk!
2022-02-05 21:31:53,648 INFO c.d.r.s.ArtFilter [JavaFX Application Thread] preview ‘c79859dc-1560-11e7-afe2-06d95fe194ed’ loaded from disk!
2022-02-05 21:31:53,963 INFO c.d.r.s.ArtFilter [JavaFX Application Thread] preview ‘c7985a33-1560-11e7-afe2-06d95fe194ed’ loaded from disk!
2022-02-05 21:31:54,244 INFO c.d.r.s.ArtFilter [JavaFX Application Thread] preview ‘c7985ab5-1560-11e7-afe2-06d95fe194ed’ loaded from disk!
2022-02-05 21:31:54,694 INFO c.d.r.s.ArtFilter [JavaFX Application Thread] preview ‘c7985b7e-1560-11e7-afe2-06d95fe194ed’ loaded from disk!
2022-02-05 21:32:04,638 DEBUG c.d.r.b [JavaFX Application Thread] starting image renderer!
2022-02-05 21:32:04,638 DEBUG c.d.r.d [JavaFX Application Thread] using ARTFILTER… checking if available!
2022-02-05 21:32:04,638 DEBUG c.d.r.d [JavaFX Application Thread] Style not available… downloading!
2022-02-05 21:32:04,639 DEBUG c.d.r.c [JavaFX Application Thread] core is initialized!
2022-02-05 21:32:04,644 DEBUG c.d.r.s.ArtFilter [Thread-34] downloading style c798497e-1560-11e7-afe2-06d95fe194ed (Abstract 3)
2022-02-05 21:32:05,795 DEBUG c.d.r.d [Thread-34] waiting for download to finish…
2022-02-05 21:32:05,822 DEBUG c.d.r.c [JavaFX Application Thread] checking if core is initialized…
2022-02-05 21:32:05,822 DEBUG c.d.r.c [JavaFX Application Thread] core is initialized!
2022-02-05 21:32:05,823 DEBUG c.d.r.c [JavaFX Application Thread] preparing progressview…
2022-02-05 21:32:05,824 DEBUG c.d.r.c [JavaFX Application Thread] rendering ARTFILTER
2022-02-05 21:32:05,826 DEBUG c.d.r.c [Thread-39] session not yet available - waiting…
2022-02-05 21:32:05,876 DEBUG c.d.r.d [Thread-34] succesfully downloaded style!
2022-02-05 21:32:05,878 DEBUG c.d.r.c [Thread-39] Session available!
2022-02-05 21:32:05,878 DEBUG c.d.r.c [Thread-39] rendering now…
2022-02-05 21:32:06,074 DEBUG c.d.c.s.d [Thread-39] Source width: 3072, height: 3840
2022-02-05 21:32:06,078 DEBUG c.d.c.s.d [Thread-39] x: 0, y: 0, width: 512, height: 512

// CUT LOG BECAUSE FORUM LIMIT 32000 CHARS

2022-02-05 21:32:06,157 DEBUG c.d.c.s.d [Thread-39] x: 2872, y: 3590, width: 200, height: 250
2022-02-05 21:32:07,487 DEBUG c.d.d.c.a [Thread-38] track w/ param
2022-02-05 21:32:10,734 DEBUG c.d.c.a [Thread-39] Render tile 1 of 99

// CUT LOG BECAUSE FORUM LIMIT 32000 CHARS

2022-02-05 21:33:12,715 DEBUG c.d.c.a [Thread-39] Render tile 99 of 99
2022-02-05 21:33:12,716 DEBUG c.d.c.s.e [Thread-39] Start blending
2022-02-05 21:33:14,420 DEBUG c.d.c.s.e [Thread-39] Blender feeding: 1702 ms
2022-02-05 21:33:14,636 DEBUG c.d.c.s.e [Thread-39] Blending: 215 ms
2022-02-05 21:33:14,699 DEBUG c.d.c.s.e [Thread-39] Writing: 62 ms
2022-02-05 21:33:14,699 DEBUG c.d.r.c [Thread-39] rendering finished…
2022-02-05 21:33:14,704 DEBUG c.d.r.c [JavaFX Application Thread] Image rendered!

Result: GPU used but only one of 3 GPUs :frowning:


3. config: 1xK4000 only

debug log

//IN THIS SESSION DAE USED ONLY CPU WHEN 1 GPU K4000 INSTALLED (and C:\DAE\tensorflow_jni.dll exists and C:\DAE is included in the system (user) Path)

2022-02-05 21:49:21,865 INFO c.d.d.u.c [main] verbosing output!
2022-02-05 21:49:21,865 INFO c.d.d.a [main] Starting Deep Art Effects for Desktop - GPU
2022-02-05 21:49:22,146 DEBUG o.s.o.w.WindowsOperatingSystem [main] Debug privileges not enabled.
2022-02-05 21:49:22,287 INFO c.d.d.a [main]

OS: Microsoft Windows 64Bit Version: 10
RAM: 14175MB of 16335MB available
CPU: AuthenticAMD Family 23 Model 8 Stepping 2 with 12 cores
DISK: 146837MB usable of 146837MB free of 243271MB total space
%JAVA HOME%:null
%PATH%: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\bin;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\libnvvp;;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0;C:\Windows\System32\OpenSSH;C:\Program Files\PuTTY;C:\Program Files (x86)\WinSCP;C:\Program Files\nodejs;C:\Program Files\NVIDIA Corporation\Nsight Compute 2022.1.0;C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\bin\…\extras\CUPTI\lib64;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\bin;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\libnvvp;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0;C:\Windows\System32\OpenSSH;C:\Program Files\PuTTY;C:\Program Files (x86)\WinSCP;C:\Program Files\nodejs;C:\Program Files\NVIDIA Corporation\Nsight Compute 2022.1.0;C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common;C:\Users\user2\AppData\Local\Microsoft\WindowsApps;;C:\Users\user2\AppData\Roaming\DeepArtEffectsGPU.\jre\bin

2022-02-05 21:49:22,287 INFO c.d.d.a [main] Running on 64bit - Windows 10!
2022-02-05 21:49:22,287 INFO c.d.d.a [main] JAVA: 11.0.11 - 64bit JVM
2022-02-05 21:49:22,318 DEBUG c.d.d.a [main] User Agent: Deep-Art-Effects-Desktop/1.2.7 (windows)
2022-02-05 21:49:22,427 DEBUG c.d.d.u.d [main] loading english!
2022-02-05 21:49:22,427 DEBUG c.d.d.u.d [main] LOADING LANGUGAGE FROM: /lang/en_US.properties
2022-02-05 21:49:22,427 DEBUG c.d.d.u.d [main] LOADING LANGUGAGE FROM: /lang/en_US.properties
2022-02-05 21:49:22,443 DEBUG c.d.r.r.a.b [main] initializing OpenCV…
2022-02-05 21:49:23,068 DEBUG c.d.r.r.a.b [main] initialized OpenCV!
2022-02-05 21:49:23,068 DEBUG c.d.r.c [main] initializing renderingManager…
2022-02-05 21:49:23,068 DEBUG c.d.r.c [Thread-2] Initializing DeepArtEffects Core…
2022-02-05 21:49:23,068 DEBUG c.d.r.c [main] RenderingManager initialized.
2022-02-05 21:49:24,517 DEBUG c.d.d.g.c [JavaFX Application Thread] initializing Window…
2022-02-05 21:49:25,073 INFO c.d.r.c [Thread-2] DeepArtEffects Core initialized!
2022-02-05 21:49:28,381 DEBUG c.d.d.g.c [JavaFX Application Thread] initializing GUI…
2022-02-05 21:49:28,627 DEBUG c.d.d.g.c [JavaFX Application Thread] GUI initialized!
2022-02-05 21:49:28,636 DEBUG c.d.d.g.c [JavaFX Application Thread] Window initialized!
2022-02-05 21:49:28,831 DEBUG c.d.d.g.c [JavaFX Application Thread] checking License…
2022-02-05 21:49:28,832 DEBUG c.d.d.g.c [JavaFX Application Thread] TRID: a2b4188eb8efed23
2022-02-05 21:49:29,453 INFO c.d.d.g.c [Thread-9] newest version: true
2022-02-05 21:49:29,501 DEBUG c.d.d.c.a [Thread-8] tracked
2022-02-05 21:49:30,002 DEBUG c.d.d.c.a [Thread-7] track w/ param
2022-02-05 21:49:30,201 INFO c.d.d.u.c [JavaFX Application Thread] C:\Users\user2\Desktop\DAE
2022-02-05 21:49:32,519 DEBUG c.d.d.b.a [JavaFX Application Thread] opening image from file: C:\Users\user2\Desktop\DAE\pexels-nuta-sorokina-10220117.jpg
2022-02-05 21:49:38,284 DEBUG c.d.d.b.a [JavaFX Application Thread] successfully opened image from file!
2022-02-05 21:49:38,850 DEBUG c.d.d.c.a [Thread-11] tracked
2022-02-05 21:49:39,125 INFO c.d.r.s.ArtFilter [JavaFX Application Thread] preview ‘c7984d3c-1560-11e7-afe2-06d95fe194ed’ loaded from disk!
2022-02-05 21:49:39,141 INFO c.d.r.s.ArtFilter [JavaFX Application Thread] preview ‘ed8e394f-1b90-11e7-afe2-06d95fe194ed’ loaded from disk!
2022-02-05 21:49:39,154 INFO c.d.r.s.ArtFilter [JavaFX Application Thread] preview ‘c7984b32-1560-11e7-afe2-06d95fe194ed’ loaded from disk!
2022-02-05 21:49:39,168 INFO c.d.r.s.ArtFilter [JavaFX Application Thread] preview ‘c7984cac-1560-11e7-afe2-06d95fe194ed’ loaded from disk!
2022-02-05 21:49:39,187 INFO c.d.r.s.ArtFilter [JavaFX Application Thread] preview ‘c7985469-1560-11e7-afe2-06d95fe194ed’ loaded from disk!
2022-02-05 21:49:39,201 INFO c.d.r.s.ArtFilter [JavaFX Application Thread] preview ‘c7985759-1560-11e7-afe2-06d95fe194ed’ loaded from disk!
2022-02-05 21:49:39,216 INFO c.d.r.s.ArtFilter [JavaFX Application Thread] preview ‘c7985796-1560-11e7-afe2-06d95fe194ed’ loaded from disk!
2022-02-05 21:49:39,229 INFO c.d.r.s.ArtFilter [JavaFX Application Thread] preview ‘c79857d7-1560-11e7-afe2-06d95fe194ed’ loaded from disk!
2022-02-05 21:49:39,240 INFO c.d.r.s.ArtFilter [JavaFX Application Thread] preview ‘c7985851-1560-11e7-afe2-06d95fe194ed’ loaded from disk!
2022-02-05 21:49:48,394 DEBUG c.d.r.c [JavaFX Application Thread] core is initialized!
2022-02-05 21:49:54,619 DEBUG c.d.r.b [JavaFX Application Thread] starting image renderer!
2022-02-05 21:49:54,619 DEBUG c.d.r.d [JavaFX Application Thread] using ARTFILTER… checking if available!
2022-02-05 21:49:54,619 DEBUG c.d.r.d [JavaFX Application Thread] Style not available… downloading!
2022-02-05 21:49:54,620 DEBUG c.d.r.c [JavaFX Application Thread] core is initialized!
2022-02-05 21:49:54,626 DEBUG c.d.r.s.ArtFilter [Thread-23] downloading style c7985851-1560-11e7-afe2-06d95fe194ed (Eye)
2022-02-05 21:49:55,434 DEBUG c.d.r.d [Thread-23] waiting for download to finish…
2022-02-05 21:49:55,451 DEBUG c.d.r.c [JavaFX Application Thread] checking if core is initialized…
2022-02-05 21:49:55,451 DEBUG c.d.r.c [JavaFX Application Thread] core is initialized!
2022-02-05 21:49:55,451 DEBUG c.d.r.c [JavaFX Application Thread] preparing progressview…
2022-02-05 21:49:55,452 DEBUG c.d.r.c [JavaFX Application Thread] rendering ARTFILTER
2022-02-05 21:49:55,454 DEBUG c.d.r.c [Thread-28] session not yet available - waiting…
2022-02-05 21:49:55,484 DEBUG c.d.r.c [Thread-28] Session available!
2022-02-05 21:49:55,484 DEBUG c.d.r.c [Thread-28] rendering now…
2022-02-05 21:49:55,494 DEBUG c.d.r.d [Thread-23] succesfully downloaded style!
2022-02-05 21:49:55,641 DEBUG c.d.c.s.d [Thread-28] Source width: 1536, height: 1920
2022-02-05 21:49:55,643 DEBUG c.d.c.s.d [Thread-28] x: 0, y: 0, width: 512, height: 512
2022-02-05 21:49:55,649 DEBUG c.d.c.s.d [Thread-28] x: 0, y: 359, width: 512, height: 512
2022-02-05 21:49:55,650 DEBUG c.d.c.s.d [Thread-28] x: 0, y: 718, width: 512, height: 512
2022-02-05 21:49:55,650 DEBUG c.d.c.s.d [Thread-28] x: 0, y: 1077, width: 512, height: 512
2022-02-05 21:49:55,651 DEBUG c.d.c.s.d [Thread-28] x: 0, y: 1436, width: 512, height: 484
2022-02-05 21:49:55,651 DEBUG c.d.c.s.d [Thread-28] x: 359, y: 0, width: 512, height: 512
2022-02-05 21:49:55,652 DEBUG c.d.c.s.d [Thread-28] x: 359, y: 359, width: 512, height: 512
2022-02-05 21:49:55,652 DEBUG c.d.c.s.d [Thread-28] x: 359, y: 718, width: 512, height: 512
2022-02-05 21:49:55,653 DEBUG c.d.c.s.d [Thread-28] x: 359, y: 1077, width: 512, height: 512
2022-02-05 21:49:55,653 DEBUG c.d.c.s.d [Thread-28] x: 359, y: 1436, width: 512, height: 484
2022-02-05 21:49:55,654 DEBUG c.d.c.s.d [Thread-28] x: 718, y: 0, width: 512, height: 512
2022-02-05 21:49:55,654 DEBUG c.d.c.s.d [Thread-28] x: 718, y: 359, width: 512, height: 512
2022-02-05 21:49:55,654 DEBUG c.d.c.s.d [Thread-28] x: 718, y: 718, width: 512, height: 512
2022-02-05 21:49:55,655 DEBUG c.d.c.s.d [Thread-28] x: 718, y: 1077, width: 512, height: 512
2022-02-05 21:49:55,655 DEBUG c.d.c.s.d [Thread-28] x: 718, y: 1436, width: 512, height: 484
2022-02-05 21:49:55,656 DEBUG c.d.c.s.d [Thread-28] x: 1077, y: 0, width: 459, height: 512
2022-02-05 21:49:55,656 DEBUG c.d.c.s.d [Thread-28] x: 1077, y: 359, width: 459, height: 512
2022-02-05 21:49:55,657 DEBUG c.d.c.s.d [Thread-28] x: 1077, y: 718, width: 459, height: 512
2022-02-05 21:49:55,657 DEBUG c.d.c.s.d [Thread-28] x: 1077, y: 1077, width: 459, height: 512
2022-02-05 21:49:55,658 DEBUG c.d.c.s.d [Thread-28] x: 1077, y: 1436, width: 459, height: 484
2022-02-05 21:49:56,625 DEBUG c.d.d.c.a [Thread-27] track w/ param
2022-02-05 21:49:56,983 DEBUG c.d.c.a [Thread-28] Render tile 1 of 20

// CUT LOG BECAUSE FORUM LIMIT 32000 CHARS

2022-02-05 21:50:16,982 DEBUG c.d.c.a [Thread-28] Render tile 20 of 20
2022-02-05 21:50:16,983 DEBUG c.d.c.s.e [Thread-28] Start blending
2022-02-05 21:50:17,701 DEBUG c.d.c.s.e [Thread-28] Blender feeding: 716 ms
2022-02-05 21:50:17,746 DEBUG c.d.c.s.e [Thread-28] Blending: 45 ms
2022-02-05 21:50:17,760 DEBUG c.d.c.s.e [Thread-28] Writing: 14 ms
2022-02-05 21:50:17,762 DEBUG c.d.r.c [Thread-28] rendering finished…
2022-02-05 21:50:17,765 DEBUG c.d.r.c [JavaFX Application Thread] Image rendered!

Result: Only CPU used :frowning:


OK, my K4000 is an old dinosaur and it looks like your TF build doesn’t include compute capability 3.0, but why aren’t the 3x K620 working together? For now I want to achieve (or force) at least this.

And can you explain which line in the log shows whether the GPU or the CPU was used? I read it again but felt blind or stupid…

P.S. In case it matters: only in the 3rd test did I use the licensed DAE version; in the 1st and 2nd tests I used the trial with watermarks.

P.S.2. Sorry for the bold font in the spoilers (debug logs) - that’s the forum engine having fun, not me.

There’s no Cuda in your logs at all - have you installed cuDNN?

You should have a directory at…
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0

You should have already downloaded cuDNN from …

One of the 9.0 versions, such as cuDNN v7.6.5 (November 5th, 2019), for CUDA 9.0
You should have extracted this into the C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0 directory, overwriting some files with the newer ones.

Check and/or do this and if there’s still no joy we’ll try Cuda 10.0 (again, I’m not sure my Cuda 10 build supports compute capability 3.0), but it should give you 3 GPUs

I copied the cuDNN files into this folder (on both PCs), as described on your site, along with patches 1-2-3-4 and the rest (and a reboot…)

OK - try 10 then - this will take me a while as I’ll need to set up a machine with the correct stuff on it which means the Russian PC is the only one I have available (think I’ve still got some credit on it).

As I say, this will take several hours for me to do (it’s mostly waiting for things to upload / download) but I need to install Cuda on a machine I can’t reboot (so I’ll just leave it alone for several hours to make sure it gets a reboot)

Something just occurred to me

In the screenshot above where it looks like only one GPU is doing anything, that is not necessarily what’s happening. The graphs are designed to show graphical use, not computing use. It is therefore possible that all three are working but it’s only showing the copies (which happen to be going to one GPU).

The only way to be sure is to try a lengthy render using three GPUs and time it. Then disconnect two GPUs and time how long one takes. If the time is the same then - oh well, that idea was wrong…

If, however, it takes more than twice as long with one GPU then the others were being used after all…
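
A rough way to read the result of that timing test, as a minimal sketch. The timings in the usage lines are made-up placeholders, not measurements, and the 2x threshold is just the rule of thumb from the paragraph above.

```python
def speedup(single_gpu_seconds: float, multi_gpu_seconds: float) -> float:
    """Return how many times faster the multi-GPU render was."""
    if single_gpu_seconds <= 0 or multi_gpu_seconds <= 0:
        raise ValueError("timings must be positive")
    return single_gpu_seconds / multi_gpu_seconds


def extra_gpus_were_used(single_gpu_seconds: float,
                         multi_gpu_seconds: float,
                         threshold: float = 2.0) -> bool:
    """True if one GPU took more than `threshold` times as long as several."""
    return speedup(single_gpu_seconds, multi_gpu_seconds) > threshold


# Hypothetical timings in seconds, for illustration only.
print(extra_gpus_were_used(300.0, 110.0))  # one GPU was far slower
print(extra_gpus_were_used(120.0, 118.0))  # about the same time
```

Use the same picture, style and chunk size for both runs, otherwise the comparison says nothing.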

Almost got everything uploaded - I’ll be able to see GPU logs using that PC - it’s got a Tesla P40 (well, 25% of one) - Compute Capability = 6.1, 960 cores for my quarter of it

Well, I now know why Cuda ain’t showing up in your logs - DAE doesn’t log that info from the desktop application (I think it’s going to stderr rather than stdout - a note for DAE bods)

If you use the CLI rather than the desktop you get to see the full info if you ask for it (I got them to add this option a while back)

If you download the CLI and unzip it somewhere then type (or cut + paste) something like this into a command prompt you’ll see the GPU info

.\DeepArtEffectsCLI.exe -verbose artfilter -input c:\somedir\somepic.jpg -output c:\somedir\styled.jpg -stylename "Gothic"

Type that in the directory DeepArtEffectsCLI.exe is in, substituting real filenames for input (which must already exist) and output (where the rendered image will be saved), and you’ll get a screen full of text with loads of information.

I tried this on the Russian PC (still have 7 hours left after this test…) and got the following

[Extract]

scaling to original!01:55:31.051 [main] INFO  com.deeparteffects.desktopapp.util.c - scaling to original!

waiting for core...01:55:31.059 [main] INFO  com.deeparteffects.desktopapp.util.c - waiting for core...

2022-02-06 01:55:31.126729: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2022-02-06 01:55:31.325085: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2022-02-06 01:55:31.572035: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Found device 0 with properties:
name: GRID RTX6000P-6Q major: 7 minor: 5 memoryClockRate(GHz): 1.62
pciBusID: 0000:00:0b.0
2022-02-06 01:55:31.579910: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2022-02-06 01:55:31.944889: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
2022-02-06 01:55:32.311871: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_100.dll
2022-02-06 01:55:32.359924: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_100.dll
2022-02-06 01:55:32.717856: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_100.dll
2022-02-06 01:55:33.053361: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_100.dll
2022-02-06 01:55:33.368722: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2022-02-06 01:55:33.375064: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1767] Adding visible gpu devices: 0
2022-02-06 01:55:34.363966: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1180] Device interconnect StreamExecutor with strength 1 edge matrix:
2022-02-06 01:55:34.370016: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1186]      0
2022-02-06 01:55:34.376024: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1199] 0:   N
2022-02-06 01:55:34.380407: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1325] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3962 MB memory) -> physical GPU (device: 0, name: GRID RTX6000P-6Q, pci bus id: 0000:00:0b.0, compute capability: 7.5)
2022-02-06 01:55:34.455881: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Found device 0 with properties:
name: GRID RTX6000P-6Q major: 7 minor: 5 memoryClockRate(GHz): 1.62
pciBusID: 0000:00:0b.0

[Extract]

==========

The dll / gpu stuff is repeated three times; the important bit is…

2022-02-06 01:55:34.455881: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Found device 0 with properties:
name: GRID RTX6000P-6Q major: 7 minor: 5 memoryClockRate(GHz): 1.62
pciBusID: 0000:00:0b.0

==========

This shows I was running on a RTX6000P-6Q with a 7.5 compute capability (they’ve upgraded the server - last time I got a P40)

GPUs are numbered starting at zero, so device 0 is the one I was using.

If you try the same thing on your machine, that should show you which GPUs are in use.

Note - Although the CLI is marked as CPU only by using my version of tensorflow_jni.dll the CLI will actually be using the GPU if it’s available - well, tensorflow will at least, dunno what DAE does internally…

Your output will have different dll names owing to the fact that I tested on Cuda10 (cuda 9 don’t support the GPU on that machine but cuda 9 is fine for yours)
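
If the wall of text is hard to read, the GPU lines can be picked out with a few lines of Python. This is only a sketch: it assumes you have saved the console output to a string or file yourself, and the regular expression is based purely on the "Created TensorFlow device" line shown in the extract above, so other TensorFlow builds may word it differently. The function name is made up for the example.

```python
import re

# Matches the "Created TensorFlow device ... physical GPU (...)" log line.
DEVICE_RE = re.compile(
    r"with (?P<mem_mb>\d+) MB memory\).*?"
    r"physical GPU \(device: (?P<index>\d+), name: (?P<name>[^,]+), "
    r"pci bus id: (?P<pci>[^,]+), compute capability: (?P<cc>\d+\.\d+)\)"
)


def created_gpus(log_text: str) -> list:
    """Return one dict per GPU that TensorFlow actually created a device for."""
    seen = {}
    for match in DEVICE_RE.finditer(log_text):
        # The block repeats in the log, so keep one entry per device index.
        seen[int(match.group("index"))] = {
            "index": int(match.group("index")),
            "name": match.group("name"),
            "memory_mb": int(match.group("mem_mb")),
            "compute_capability": match.group("cc"),
        }
    return [seen[i] for i in sorted(seen)]


sample = (
    "2022-02-06 01:55:34.380407: I tensorflow/core/common_runtime/gpu/"
    "gpu_device.cc:1325] Created TensorFlow device "
    "(/job:localhost/replica:0/task:0/device:GPU:0 with 3962 MB memory) "
    "-> physical GPU (device: 0, name: GRID RTX6000P-6Q, "
    "pci bus id: 0000:00:0b.0, compute capability: 7.5)\n"
)
for gpu in created_gpus(sample * 3):
    print(gpu)
```

An empty list means TensorFlow never created a GPU device, so the render ran on the CPU.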

First, thanks again for awesome response!

I spent the whole day running various tests and can now share my results and feedback.

I think I hit every possible error and iceberg along the way.

So, in order of appearance :slight_smile: :

First, I started dae-cli and it won’t work without an active license. OK: I went to my account on the DAE website, refreshed the license, activated it with the CLI command dae-cli.exe license -add [key], and it works!

As you said, the CLI tells me more in its console output (DAE staff, guys, can you please add more info to the debug logs?!)

And it tells me: CUDA is working! (I used your tensorflow .dll and later added its folder to the Path env variable.)

4 GPU cards were found, and 1 card was disabled with a reason along the lines of “minimum CUDA capability 3.5 required, but my Quadro K4000 only supports 3.0”.

I tested with 3x K620, 2x K620 and 1x K620 (and also with 2x K620 where one was disabled for computing in the Nvidia control panel), and all results show only one GPU being used: no multi-GPU support with the TensorFlow I used (Peardox’s .dll).

I also saw a message in the console output about TensorFlow being built without AVX2 CPU instruction support.

I haven’t tried CUDA 10 because, as you said earlier, my GPUs don’t need it.

SPEED TEST (total time from starting DAE CLI to finish)
Same 4000x5000px input picture (this free stock photo)
Tile size 512x512px (a bit smaller at the image borders, as visible in the logs)

CPU:
TR4 Ryzen Threadripper 1900x [email protected]
20 tiles time: 25sec (0.8 tile per second)
154 tiles time: 150sec (2m30s) (1.03 tiles per second)

Ryzen 5 2600x [email protected]
20 tiles time: 23 sec (0.87 tile per second)
154 tiles time: 129sec (2m09s) (1.19 tiles per second) - hmm… faster than Threadripper ?!

GPU:
1x K620 Nvidia Quadro “Maxwell” (384 CUDA cores, compute capability 5.0) with Ryzen 5 2600x (15% CPU load)
20 tiles time: 37 sec (0.54 tile per second)
154 tiles time: 107 sec (1m47s) (1.44 tile per second)

1x K620 with TR4 Threadripper 1900x (12% CPU load)
154 tiles time: 127 sec (2m07s) (1.21 tiles per second) - first DAE start after booting the PC
154 tiles time: 117 sec (1m57s) (1.32 tiles per second) - second and later DAE starts after booting the PC

2x K620 with TR4 Threadripper 1900x
154 tiles time: 127 sec (the GPU compute graph showed only 1 GPU in use)
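
The tiles-per-second figures are simply tiles divided by total seconds. A small sketch that recomputes them from the 154-tile timings listed above (the labels are mine, the numbers are from this post):

```python
def tiles_per_second(tiles: int, seconds: float) -> float:
    """Throughput over the whole run, startup time included."""
    return tiles / seconds


# (label, tiles, total seconds) taken from the 154-tile runs above.
runs = [
    ("CPU only, Threadripper 1900x", 154, 150),
    ("CPU only, Ryzen 5 2600x", 154, 129),
    ("1x K620 + Ryzen 5 2600x", 154, 107),
    ("1x K620 + Threadripper 1900x", 154, 117),
    ("2x K620 + Threadripper 1900x", 154, 127),
]

for label, tiles, seconds in runs:
    print(f"{label}: {tiles_per_second(tiles, seconds):.2f} tiles/s")
```

The 2x K620 run is no faster than the 1x K620 runs on the same machine, which fits the compute graph showing only one GPU busy.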

And now I have some questions:

  • How do I activate multi-GPU mode, if that’s possible? (The TensorFlow docs (link) describe some code changes needed to use multiple GPUs.)

  • How do I select the tile size with DAE-CLI? (My latest experiments with GPU and CPU rendering in Blender’s Cycles engine suggest big tiles for CPU and small tiles for GPU, and I want to test this theory with DAE.)

  • Can AVX2 support boost processing speed? How can we enable it?

  • In an ideal scenario, in my best dreams, DAE would use the CPU and multiple GPUs together for more speed, each with a different tile size, and blend (merge) all the segments at the end. Maybe we can request this for a future version of DAE?

More findings

I also found a way to see GPU compute load in Windows Task Manager: simply change the default graph to Compute:

I also got this warning about memory:

Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.05GiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.

And TF says the memory size of my K620 (2GB on the sticker) is 1393 MB (more info in the attached logs)

some cropped console outputs

//Installed 2 К620, CPU: TR4 Threadripper 1900x
//1xК620 for 20 tiles time 37 sec

20:46:19.644 verbosing output!

2022-02-06 20:46:21.502395: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: Quadro K620 major: 5 minor: 0 memoryClockRate(GHz): 1.124

2022-02-06 20:46:22.244563: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1393 MB memory) → physical GPU (device: 0, name: Quadro K620, pci bus id: 0000:09:00.0, compute capability: 5.0)

20:46:28.705 [main] DEBUG com.deeparteffects.core.stitcher.d - Source width: 1536, height: 1920

2022-02-06 20:46:30.464371: W tensorflow/core/common_runtime/bfc_allocator.cc:237] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.05GiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.

//10 messages about memory (20 tiles used)

20:46:55.786 [main] DEBUG com.deeparteffects.core.a - Render tile 20 of 20

20:46:56.318 finished!


//1xК620 for 154 tiles time 117 sec

21:40:42.009 verbosing output!

21:40:43.287 scaling to original!

2022-02-06 21:40:44.580487: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: Quadro K620 major: 5 minor: 0 memoryClockRate(GHz): 1.124

2022-02-06 21:40:44.599498: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1394 MB memory) → physical GPU (device: 0, name: Quadro K620, pci bus id: 0000:09:00.0, compute capability: 5.0)

21:40:50.890 [main] DEBUG com.deeparteffects.core.stitcher.d - Source width: 4000, height: 5000
21:42:36.059 [main] DEBUG com.deeparteffects.core.a - Render tile 154 of 154

21:42:39.140 - finished!


//2xК620 for 154 tiles time 127 sec (in GPU computing graph looked only 1 gpu used)

21:51:05.907 verbosing output!

scaling to original!

2022-02-06 21:51:13.126230: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: Quadro K620 major: 5 minor: 0 memoryClockRate(GHz): 1.124

2022-02-06 21:51:13.130753: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 1 with properties:
name: Quadro K620 major: 5 minor: 0 memoryClockRate(GHz): 1.124

2022-02-06 21:51:13.151977: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1394 MB memory) → physical GPU (device: 0, name: Quadro K620, pci bus id: 0000:09:00.0, compute capability: 5.0)
2022-02-06 21:51:13.157081: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 1394 MB memory) → physical GPU (device: 1, name: Quadro K620, pci bus id: 0000:41:00.0, compute capability: 5.0)

21:51:19.901 [main] DEBUG com.deeparteffects.core.stitcher.d - Source width: 4000, height: 5000

21:53:09.050 [main] DEBUG com.deeparteffects.core.a - Render tile 154 of 154

21:53:12.727 - finished!

In my case this is repeated up to 5 times; I cropped it and other noise for readability…
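
For anyone wondering where 20 and 154 tiles come from: the tile coordinates in the debug log earlier in the thread (x: 359, 718, 1077 with 512px tiles) suggest the tiles overlap and start every 359px. The sketch below assumes exactly that. The 359px step is inferred from the log, not from any DAE documentation, and the function names are made up.

```python
import math


def tile_origins(length: int, tile: int = 512, stride: int = 359) -> list:
    """Start offsets along one axis; the last tile reaches the image edge."""
    if length <= tile:
        return [0]
    count = math.ceil((length - tile) / stride) + 1
    return [i * stride for i in range(count)]


def tile_grid(width: int, height: int, tile: int = 512, stride: int = 359) -> list:
    """(x, y, tile_width, tile_height) for every tile, clipped at the borders."""
    return [
        (x, y, min(tile, width - x), min(tile, height - y))
        for x in tile_origins(width, tile, stride)
        for y in tile_origins(height, tile, stride)
    ]


print(len(tile_grid(1536, 1920)))  # the "Source width: 1536, height: 1920" run
print(len(tile_grid(4000, 5000)))  # the "Source width: 4000, height: 5000" run
print(tile_grid(1536, 1920)[-1])   # last tile, smaller at the border
```

Under this assumption the grid is 4 x 5 tiles for 1536x1920 and 11 x 14 for 4000x5000, which matches the "Render tile 20 of 20" and "Render tile 154 of 154" lines.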

I never spotted the compute thing - learn something new every day :slight_smile:

I’ll look at your results properly later - about to have a nap but before I do…

o They are already working on a new version (https://forum.deeparteffects.com/t/tensorflow-2-is-on-the-way/) but it appears to have been delayed cos of the Log4j thing in December.

o AVX2 requires that I build a new version of the DLL, I’m not doing that, it’s a waste of time as TF2 is coming sometime.

o Having said that my Cuda 10 DLLs should have AVX2 enabled. Actually they must have as my full log for the CLI run didn’t bitch about AVX2 and that machine definitely has that facility.

o In regard to multiple GPUs, well, I’ve only got one - why should anyone else have more :slight_smile: Actually, it’s not something I’ve looked into - I presumed that it would use more than one. TF2 may automatically resolve this - we’ll find out when it arrives.

o Don’t worry too much about the allocation messages, if it actually has a problem it’ll let you know (DAE will actually crash in this event just after TF screams about it). Looking at your messages you’ll have problems with larger chunk sizes.

o The ‘tile size’ you’re on about is controlled by adding, for example, “-chunksize 768” to the cli command, you can also set it in the GUI preferences but you get limited choice in the GUI. Chunksize can be any number from 128 to 2048. Be aware though that large chunk sizes use a LOT of GPU so at some point you WILL run out of memory (and DAE will crash). My old card (it died a long time ago) only had 2GB and would crash using a chunk size > 512 (which is probably why this is the default in the GUI)

o if you type DeepArtEffectsCLI -help it’ll show you all the options available - there are quite a lot of them, my example above was deliberately the simplest version that would output detailed logs.

o The wiki article I based my assumption that the K4000 would work on was wrong (I’ll go edit the article - it’s not by me, but it’s wrong and I do check it often)

o The “More logs please” - I put a note in my message about that specifically so DAE would notice it - they read these forums, sometimes they even reply :slight_smile:
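
If you want to try several chunk sizes in a row, a small helper that builds the command line saves retyping. This is a sketch, not anything official: it only uses the options mentioned in this thread (-verbose, artfilter, -input, -output, -stylename, -chunksize), the 128 to 2048 range is the one quoted above, and putting -chunksize at the end of the command is my assumption, so check -help if the CLI complains. The function name and paths are placeholders.

```python
def artfilter_args(exe, input_path, output_path, style,
                   chunksize=None, verbose=True):
    """Build the argument list for one DeepArtEffectsCLI artfilter run."""
    args = [exe]
    if verbose:
        args.append("-verbose")
    args += ["artfilter", "-input", input_path,
             "-output", output_path, "-stylename", style]
    if chunksize is not None:
        if not 128 <= chunksize <= 2048:
            raise ValueError("chunksize must be between 128 and 2048")
        # Assumed position: appended after the other artfilter options.
        args += ["-chunksize", str(chunksize)]
    return args


# Placeholder paths; substitute real files before running anything.
for size in (256, 512, 768):
    print(" ".join(artfilter_args(r".\DeepArtEffectsCLI.exe",
                                  r"c:\somedir\somepic.jpg",
                                  rf"c:\somedir\styled_{size}.jpg",
                                  "Gothic", chunksize=size)))
```

The returned list can be passed straight to subprocess.run. Start small and work up, since a chunk size that is too big for the card’s memory will crash DAE.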

Hi, yesterday I forgot to share this interesting log, in which TensorFlow said “please set an env var for me, because I’m not in the mood today”

See cropped DAE-CLI console output

// GPU installed and active in NVidia control panel, but NOT USED. CPU 20 tiles 512x512 R5 2600x time 24sec

20:12:56.827 [main] INFO com.deeparteffects.desktopapp.util.c - verbosing output!

gpu_device.cc:1640] Found device 0 with properties:
name: Quadro K620 major: 5 minor: 0 memoryClockRate(GHz): 1.124

gpu_device.cc:1640] Found device 1 with properties:
name: Quadro K4000 major: 3 minor: 0 memoryClockRate(GHz): 0.8105

gpu_device.cc:1748] Ignoring visible gpu device (device: 0, name: Quadro K620, pci bus id: 0000:07:00.0, compute capability: 5.0) with core count: 3. The minimum required count is 4. You can adjust this requirement with the env var TF_MIN_GPU_MULTIPROCESSOR_COUNT.

gpu_device.cc:1717] Ignoring visible gpu device (device: 1, name: Quadro K4000, pci bus id: 0000:06:00.0, compute capability: 3.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.

20:13:20.329 [main] INFO com.deeparteffects.desktopapp.util.c - finished!

Then I set the env var to “2” (the idiot’s guide helped me again) and on the next start my K620 was used by DAE to process the image!
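
For anyone who prefers not to change the system-wide environment, the variable can also be set for a single run. A minimal sketch in Python, assuming the CLI is started from a script; TF_MIN_GPU_MULTIPROCESSOR_COUNT is the name from the log message above, while the function names and the argument list are placeholders.

```python
import os
import subprocess


def env_with_min_multiprocessors(count, base=None):
    """Copy the environment and lower TensorFlow's multiprocessor minimum."""
    env = dict(os.environ if base is None else base)
    env["TF_MIN_GPU_MULTIPROCESSOR_COUNT"] = str(count)
    return env


def run_with_min_multiprocessors(args, count=2):
    """Run one command with the variable set for that process only."""
    return subprocess.run(args, env=env_with_min_multiprocessors(count),
                          check=True)


# Show what the child process would receive, without launching anything.
print(env_with_min_multiprocessors(2)["TF_MIN_GPU_MULTIPROCESSOR_COUNT"])
```

The value has to be no higher than the count TF reports for the card (3 for the K620 in the log above), otherwise the GPU is still ignored.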

What the “3 cores” on this GPU are… no idea, maybe you know?


About AVX2: it looks like only your tensorflow .dll was compiled without AVX2 support. But as I understand it, it was built specially for CUDA usage, so until DAE supports a mixed GPU+CPU mode (with different chunk sizes :star_struck: of course), your .dll, @Peardox, is OK.

OK, I’ll test it, but my theory (based on Blender experience) is: small chunk size for GPU, big chunk size for CPU.
CPU = high GHz and lots of RAM, but a low core count
GPU = less RAM and lower GHz, but many more cores (and can render more small tiles in parallel)


I’m afraid that’s not possible without some code magic from the DAE staff, and I hope it will arrive in a future release :+1:
Thanks to the DAE staff for including me in the beta-testers program. Now I’m waiting for fresh versions to experiment with…


Thanks for the suggestion. Maybe (oh, definitely) it’s off-topic, but this is a good moment to ask:
Can I choose in the CLI which part of the image is affected (all, only background or only foreground), the way I can set it in DAE desktop:

I ask because artfilter and removebackground are separate commands, as I saw in -help, but in the desktop GUI this effect is a part (or option) of artfilter.


Oh, I found a video from DAE explaining chunk size, and now I understand @Peardox’s words about it, and that my words:

were a mistake. Quite probably in some cases I will need a bigger chunk size with the GPU for my artworks…

That’s a reason to look for a cool new Nvidia GPU, but their mining-inflated prices scare me 0_о

@deeparteffects, what about future plans for ROCm support? TF can do this.

hey @drams pls put that in the feature request section for a better overview.

Type …

DeepArtEffectsCLI -help > help.txt

You now have a file called help.txt with all the commands in an easy to read text file

removebackground doesn’t actually have any options for the background type. It always removes the background with transparency (then DAE GUI adds one back in if you ask it to). Actually, technically it makes the alpha channel an 8-bit depth-map - would be fun in Blender if you know what to do :slight_smile:

If you want to add a background back in then use a paint package or ImageMagick for CLI stuff

When replying last night I thought you’d included the full log (you just gave me my own extract back), but looking at the snippet above: did TF_MIN_GPU_MULTIPROCESSOR_COUNT work properly, and did you see it in the compute graph? As I’ve never seen DAE on a multi-GPU machine I have no way of telling what will happen.

Yeah, small chunk for GPU, large for CPU = no crash. Actually by putting my tensorflow_jni.dll in the path what you’ve done is replaced the DAE version so with mine in path CPU would also use GPU if found (i.e. if you really want CPU - remove my dll from the path - renaming the dll works fine…)

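The 3 “cores” will be the K620’s three multiprocessors, not your 3x K620 - confusing terminology by TF. Its 384 CUDA cores are grouped into 3 multiprocessors of 128 each, and TF ignores any GPU with fewer multiprocessors than TF_MIN_GPU_MULTIPROCESSOR_COUNT (the log says the minimum was 4), which is why lowering the env var let the card through :slight_smile: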