Enable hardware crypto acceleration on your Via Nano dedi

rm_ Member
edited December 2014 in Tutorials

VIA PadLock Advanced Cryptography Engine (VIA PadLock ACE) is a technology used in many VIA processors that provides very fast hardware encryption and decryption.

The VIA Nano used in Online.net's Kidechire and SCGen2 servers supports the Padlock extensions as well. In my tests, enabling them in OpenSSL improved SHA1 performance by almost 4x and cut the CPU load from HTTPS by a factor of 3.

See my howto on rebuilding OpenSSL in Debian with Padlock support.

Let me know whether it worked for you, and whether it needs any clarifications or corrections.
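
Before rebuilding anything, it may help to confirm that the CPU advertises the PadLock units at all. A minimal sketch, assuming the Linux kernel's flag names for VIA CPUs (`ace`/`ace_en` for the AES engine, `phe`/`phe_en` for the hash engine, `rng`/`rng_en` for the random number generator); the function name `padlock_flags` is made up for illustration:

```shell
# List the PadLock-related CPU flags found in a cpuinfo-style file.
# Defaults to /proc/cpuinfo; pass another path to inspect a saved copy.
padlock_flags() {
    cpuinfo="${1:-/proc/cpuinfo}"
    # Take the first "flags" line, put each flag on its own line,
    # keep only the VIA PadLock ones, and join them with spaces.
    grep -m1 '^flags' "$cpuinfo" \
        | tr ' \t' '\n\n' \
        | grep -E '^(rng|ace|ace2|phe|pmm)(_en)?$' \
        | tr '\n' ' ' \
        | sed 's/ $//'
}
```

Calling `padlock_flags` with no argument reads the running system; an empty result would suggest the CPU (or the kernel) does not expose PadLock.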

Comments

  • sounds good :p

  • This is actually immensely nifty.

  • Nice, thanks for the article!

    Is it possible to use this padlock thing with dm-crypt?

  • Unable to locate package libz1g-dev

    any suggestion?

  • rm_ Member
    edited December 2014

    ben78 said: Is it possible to use this padlock thing with dm-crypt?

    I believe dm-crypt should use it automatically, regardless of whether you rebuild OpenSSL, since dm-crypt is a kernel module and uses the in-kernel crypto API directly. Just make sure you use a cipher supported in hardware ("aes-cbc").

    Of course your kernel should be compiled with it:

    $ grep PADLOCK /boot/config-*  
    /boot/config-3.14.22-rm1+:CONFIG_CRYPTO_DEV_PADLOCK=m  
    /boot/config-3.14.22-rm1+:CONFIG_CRYPTO_DEV_PADLOCK_AES=m  
    /boot/config-3.14.22-rm1+:CONFIG_CRYPTO_DEV_PADLOCK_SHA=m

    But I think that's already enabled by default in all distros.
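
Whether the kernel has actually registered the PadLock implementations can be read from /proc/crypto. A rough sketch, assuming the kernel drivers carry "padlock" in their driver name (for example `cbc-aes-padlock`); `padlock_drivers` is a hypothetical helper name:

```shell
# Print the names of kernel crypto drivers that mention "padlock".
# Reads /proc/crypto by default; pass another file to inspect a copy.
padlock_drivers() {
    crypto="${1:-/proc/crypto}"
    # /proc/crypto contains "driver       : <name>" lines, one per algorithm.
    awk -F': *' '/^driver/ && $2 ~ /padlock/ { print $2 }' "$crypto"
}
```

If this prints nothing, the modules from the config listing above are probably built but not loaded.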

  • You can also enable hardware acceleration in Tor; my CPU load is now 0.40 instead of 2.00, and the bandwidth has gone up quite a lot.

    https://globe.thecthulhu.com/#/relay/9E8E20CD0B6F0DD91F320C9149CD51958E4C0357

  • trexos Member
    edited December 2014

    @linuxthefish said:
    You can also enable hardware acceleration in Tor; my CPU load is now 0.40 instead of 2.00, and the bandwidth has gone up quite a lot.

    https://globe.thecthulhu.com/#/relay/9E8E20CD0B6F0DD91F320C9149CD51958E4C0357

    Hm, mysterious. I get 6.4 MB (advertised) without changing this.

    OnePoundWebHosting.co.uk | UK XEN VPS from £2 | See their special offers starting from 12£/year here

  • An error occurs when I execute "apt-get install devscripts fakeroot build-essential libz1g-dev":

    Unable to locate package libz1g-dev

    Can anyone help? :(

  • @hotsnow sorry, this seems to be a typo, try "zlib1g-dev".

  • hotsnow Member
    edited December 2014

    @rm_ said:
    hotsnow sorry, this seems to be a typo, try "zlib1g-dev".

    yep, it's ok now, thanks :)

  • trexos said: Hm, mysterious. I get 6.4 MB (advertised) without changing this.

    What's your relay name?

  • @linuxthefish said:
    What's your relay name?

    PM'ed you

  • @linuxthefish said:
    You can also enable hardware acceleration in Tor; my CPU load is now 0.40 instead of 2.00, and the bandwidth has gone up quite a lot.

    I'm not 100 % sure, but isn't hardware acceleration enabled by default for Tor (if available) when used with OpenSSL 1.0.1+?

  • Nyr said: when used with OpenSSL 1.0.1+?

    You still need to patch OpenSSL, as even 1.0.1+ does not include the Padlock support in its "default" form.

  • rm_ said: You still need to patch OpenSSL, as even 1.0.1+ does not include the Padlock support in its "default" form.

    I know, but Tor needed to be explicitly configured to use hardware acceleration with older OpenSSL versions, and IIRC that's no longer the case. That's what I was asking :)

  • How do I do this on CentOS?

  • xDutchy Member
    edited December 2014

    @sepei said:
    How do I do this on CentOS?

    1) backup files

    2) reinstall to debian

    3) restore files

    4) use https://romanrm.net/openssl-padlock

  • A solution without changing the OS would be cool lol

  • sepei, it could be that it's already in the tree.

    Running "openssl engine padlock" should not give an error.
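
That check can be scripted. A small sketch, assuming a successful load prints a line starting with `(padlock)`, as in the output posted further down this thread; `has_padlock_engine` is a made-up name, and a positive result only shows that the engine loads, not which algorithms it accelerates:

```shell
# Succeed if "openssl engine padlock" reports the engine as loadable.
# Errors go to stderr and are discarded; only the "(padlock) ..." line counts.
has_padlock_engine() {
    openssl engine padlock 2>/dev/null | grep -q '^(padlock)'
}
```

Typical use would be `has_padlock_engine && echo "padlock engine available"`.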

  • Do you need to restart the processes after applying the patch? I don't want to lose my uptime on tor if I don't have to.

  • black said: Do you need to restart the processes after applying the patch?

    Yes, you do; otherwise they keep using the previous version of the OpenSSL library.
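
To see which processes are still running on the old library, one option is to look for mappings of a replaced file in /proc. This is only a sketch: it assumes the package upgrade replaced the library on disk, so that the kernel marks the old mapping as "(deleted)", and both function names are invented for this example.

```shell
# Succeed if a /proc/<pid>/maps-style file shows a libssl or libcrypto
# mapping whose backing file was replaced on disk ("(deleted)" suffix).
uses_stale_openssl() {
    grep -Eq 'lib(ssl|crypto)[^ ]*\.so[^ ]* \(deleted\)$' "$1"
}

# Print the PIDs of running processes that still map the old library.
# Run as root to be able to inspect other users' processes.
stale_openssl_pids() {
    for maps in /proc/[0-9]*/maps; do
        if uses_stale_openssl "$maps" 2>/dev/null; then
            pid="${maps#/proc/}"
            echo "${pid%/maps}"
        fi
    done
}
```

Any PID printed by `stale_openssl_pids` belongs to a process that would need a restart to pick up the rebuilt OpenSSL.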

  • I get this error at the dpkg-build bit near the end:

    created directory `/root/openssl-1.0.1e/debian/tmp/usr/share/man/man7'

    installing man1/CA.pl.1ssl
    installing man1/asn1parse.1ssl
    installing man1/c_rehash.1ssl
    installing man1/ca.1ssl
    installing man1/ciphers.1ssl
    installing man1/cms.1ssl
    cms.pod around line 457: Expected text after =item, not a number
    cms.pod around line 461: Expected text after =item, not a number
    cms.pod around line 465: Expected text after =item, not a number
    cms.pod around line 470: Expected text after =item, not a number
    cms.pod around line 474: Expected text after =item, not a number
    POD document had syntax errors at /usr/bin/pod2man line 71.
    make[1]: *** [install_docs] Error 255
    make[1]: Leaving directory `/root/openssl-1.0.1e'
    make: *** [install] Error 2
    dpkg-buildpackage: error: debian/rules binary gave error exit status 2

    Not really sure how to advance. I'm new to compiling stuff too, so sorry if this was obvious.

  • Hmm, looks like this patch crashes tor for me

    Tor 0.2.5.10 (git-43a5f3d91e726291) died: Caught signal 11
    /usr/bin/tor(+0x1229de)[0xb76839de]
    /usr/lib/i386-linux-gnu/openssl-1.0.0/engines/libpadlock.so(+0x1507)[0xb753b507]
    /usr/lib/i386-linux-gnu/openssl-1.0.0/engines/libpadlock.so(+0x1507)[0xb753b507]
    /usr/lib/i386-linux-gnu/i686/cmov/libcrypto.so.1.0.0(EVP_DigestUpdate+0x1d)[0xb736140d]
    /usr/lib/i386-linux-gnu/i686/cmov/libcrypto.so.1.0.0(EVP_Digest+0x88)[0xb73616d8]
    /usr/lib/i386-linux-gnu/i686/cmov/libcrypto.so.1.0.0(RSA_padding_add_PKCS1_OAEP+0x98)[0xb733c048]
    /usr/lib/i386-linux-gnu/i686/cmov/libcrypto.so.1.0.0(+0xa17e6)[0xb73387e6]
    /usr/lib/i386-linux-gnu/i686/cmov/libcrypto.so.1.0.0(RSA_public_encrypt+0x30)[0xb7340410]
    /usr/bin/tor(crypto_pk_public_encrypt+0xb7)[0xb769c427]
    /usr/bin/tor(crypto_pk_public_hybrid_encrypt+0x152)[0xb76a2802]
    /usr/bin/tor(onion_skin_TAP_create+0x1a8)[0xb759fc88]
    /usr/bin/tor(onion_skin_create+0x6a)[0xb759dc1a]
    /usr/bin/tor(circuit_send_next_onion_skin+0x52e)[0xb75ff9de]
    /usr/bin/tor(+0x49fa7)[0xb75aafa7]
    /usr/bin/tor(circuit_receive_relay_cell+0x232)[0xb75ad422]
    /usr/bin/tor(command_process_cell+0x1f8)[0xb76162d8]
    /usr/bin/tor(channel_queue_cell+0x223)[0xb75f3d83]
    /usr/bin/tor(channel_tls_handle_cell+0x29c)[0xb75f877c]
    /usr/bin/tor(+0xda12a)[0xb763b12a]
    /usr/bin/tor(+0xc7e18)[0xb7628e18]
    /usr/bin/tor(connection_handle_read+0x70b)[0xb762ef7b]
    /usr/bin/tor(+0x2bcc1)[0xb758ccc1]
    /usr/lib/i386-linux-gnu/libevent-2.0.so.5(event_base_loop+0x3c2)[0xb74be522]
    /usr/bin/tor(do_main_loop+0x1af)[0xb758d69f]
    /usr/bin/tor(tor_main+0x22bd)[0xb759132d]
    /usr/bin/tor(main+0x33)[0xb7589963]
    /lib/i386-linux-gnu/i686/cmov/libc.so.6(__libc_start_main+0xe6)[0xb713ae46]
    /usr/bin/tor(+0x289ad)[0xb75899ad]
    
  • I am currently doing 3 MB at 57-60% load. Traffic kept rising until the 10th or so, then dropped to half, stayed there for 3 days, and then started to go back up. I was never limited by CPU, and I am not sure it is a bandwidth issue either: each time I tested, I had a good connection over Tor.
    I will wait to see whether it saturates the CPU and, if it does, at how much traffic per second that happens. Then I will see what can be done, but as long as the CPU is not saturated, I suspect other causes.

    Extremist conservative user, I wish to preserve human and civil rights, free speech, freedom of the press and worship, rule of law, democracy, peace and prosperity, social mobility, etc. Now you can draw your guns.

  • @Maounique said:

    Remember, a Tor node doesn't reach its full capacity until it has been up for two months.

    I got some dedis with the last offer and Tor is working slower than previously for me too, but I expect it to hopefully speed up during January.

  • Nyr said: Remember, a Tor node doesn't reach its full capacity until it has been up for two months.

    I know, but that does not explain why traffic halved on the 10th and stayed at about the same level for 3 days for no apparent reason. My attempts to find out what is wrong failed, so I guess I will have to wait and see.

  • coolnow said: I get this error at the dpkg-build bit near the end

    Which OS do you use? There are some fixes for your error messages: https://startpage.com/do/search?cat=web&cmd=process_search&language=english&engine0=v1all&query="cms.pod+around+line+457:+Expected+text+after+=item,+not+a+number"&abp=1&x=0&y=0
    But I'm wondering why you got these and I haven't.

    black said: Hmm, looks like this patch crashes tor for me

    I personally did not try Tor with these patches. Do other apps work fine? E.g. the "openssl sha1" test mentioned in the howto, or "openssl speed -evp aes-128-cbc -engine padlock"?

  • rm_ said: I personally did not try Tor with these patches. Do other apps work fine? E.g. the "openssl sha1" test mentioned in the howto, or "openssl speed -evp aes-128-cbc -engine padlock"?

    Yep. There's a significant performance gain with the Padlock patch on OpenSSL. Tor seems to be the only program that doesn't work.

    openssl speed -evp aes-128-cbc -engine padlock
    engine "padlock" set.
    built on: Wed Dec 17 21:13:39 CET 2014
    options:bn(64,32) rc4(8x,mmx) des(ptr,risc1,16,long) aes(partial) blowfish(idx) 
    compiler: gcc -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN -DTERMIO -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 -Wl,-z,relro -Wa,--noexecstack -Wall -march=i686 -DOPENSSL_BN_ASM_PART_WORDS -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DRMD160_ASM -DAES_ASM -DVPAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM
    The 'numbers' are in 1000s of bytes per second processed.
    type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
    aes-128-cbc      98783.53k   306857.46k   619254.45k   828959.48k   916095.09k
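
As the output itself says, the figures are in 1000s of bytes per second, so they are easy to misread by a factor of 1000. A tiny helper for converting them to MB/s; `speed_to_mbs` is a hypothetical name, not part of OpenSSL:

```shell
# Convert an "openssl speed" figure such as "916095.09k" to MB/s.
# The tool reports 1000s of bytes per second, so divide by 1000.
speed_to_mbs() {
    echo "$1" | awk '{ sub(/k$/, ""); printf "%.1f\n", $1 / 1000 }'
}
```

For the 8192-byte column above, `speed_to_mbs 916095.09k` gives 916.1, i.e. roughly 916 MB/s.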
    
  • rm_ said: Which OS do you use? There are some fixes for your error messages: https://startpage.com/do/search?cat=web&cmd=process_search&language=english&engine0=v1all&query="cms.pod+around+line+457:+Expected+text+after+=item,+not+a+number"&abp=1&x=0&y=0

    But I'm wondering why you got these and I haven't.

    Ubuntu 14.10

    Thanks for the suggestion, but I fixed it by using a patch that patched the manpages. Now I'm seeing 200+ MB/s on your test, up from ~45 MB/s. Thanks a lot for the thread, this might allow me to squeeze some extra juice out of my dedis.

  • Dylan Member
    edited December 2014

    @Maounique said:
    I know, but that does not explain why traffic halved on the 10th and stayed at about the same level for 3 days for no apparent reason. My attempts to find out what is wrong failed, so I guess I will have to wait and see.

    https://blog.torproject.org/blog/lifecycle-of-a-new-relay

    A new relay, assuming it is reliable and has plenty of bandwidth, goes through four phases: the unmeasured phase (days 0-3) where it gets roughly no use, the remote-measurement phase (days 3-8) where load starts to increase, the ramp-up guard phase (days 8-68) where load counterintuitively drops and then rises higher, and the steady-state guard phase (days 68+).

    You got your Kidechire on November 28th, right, and I think you said in another thread you set up Tor a few days after that? A load drop on December 10th would seem to fit the new relay timeline pretty closely.
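
The four phases quoted above can be turned into a quick lookup by relay age. This is just an illustration of the quoted ranges; the quoted ranges overlap at days 3, 8 and 68, so assigning those boundary days to the later phase is an assumption of this sketch, and `relay_phase` is an invented name:

```shell
# Map a relay's age in whole days to the lifecycle phase quoted above.
# Boundary days (3, 8, 68) are assigned to the later phase (assumption).
relay_phase() {
    days="$1"
    if   [ "$days" -lt 3 ];  then echo "unmeasured"
    elif [ "$days" -lt 8 ];  then echo "remote-measurement"
    elif [ "$days" -lt 68 ]; then echo "ramp-up guard"
    else                          echo "steady-state guard"
    fi
}
```

A relay started a few days after November 28th would be roughly 8 to 10 days old on December 10th, and `relay_phase 9` prints "ramp-up guard", the phase in which load first drops.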

  • My Tor also crashes after the patch. As other people said in the thread, the test went from 30-35 MB/s to 190-210 MB/s. However, Tor now crashes a while after startup with no error messages, even in the debug log :/

  • @Giulio said:
    My Tor also crashes after the patch. As other people said in the thread, the test went from 30-35 MB/s to 190-210 MB/s. However, Tor now crashes a while after startup with no error messages, even in the debug log :/

    If you install tor from their repos, you'll see what I pasted in /var/log/tor/log

  • @black add "HardwareAccel 1" in your torrc.

  • Dylan said: You got your Kidechire on November 28th, right, and I think you said in another thread you set up Tor a few days after that? A load drop on December 10th would seem to fit the new relay timeline pretty closely.

    I understand it will not be stable, but it stayed down for 3 days or so, and I don't remember having such a large drop before. Let's hope it was that :)

  • black Member
    edited December 2014

    linuxthefish said: @black add "HardwareAccel 1" in your torrc.

    Thanks, that fixed tor :)

    Edit: never mind, it crashed again for the same reason.

  • @jaakka said:
    sepei, it could be that it's already in the tree.

    Running "openssl engine padlock" should not give an error.

    Do you mean it's already in my OpenSSL? If that's what you meant, I think not: I'm getting around 45 MB/s. So again, how can I do this on CentOS?

  • @Maounique said:
    I understand it will not be stable, but it stayed down for 3 days or so, and I don't remember having such a large drop before. Let's hope it was that :)

    Hopefully! My experience setting up a few new relays lately is that Tor can be very oddly inconsistent. In the case of two identical servers on the same network, one became a guard after 4 days (which I thought was supposed to be impossible), while the other took almost 3 weeks.

  • coolice Member
    edited January 2015

    @sepei said:
    How do I do this on CentOS?

    A little bit of necro-posting, but not a lot :)

    For CentOS it is even simpler: just add a few lines to the config.

    http://www.linuxtopia.org/online_books/rhel6/rhel_6_security_guide/rhel_6_security_ch03s07.html

    ~3 times the speed

     dd if=/dev/zero count=100 bs=1M | ssh -c aes128-cbc localhost "cat >/dev/null"
    [email protected]'s password: 
    100+0 records in
    100+0 records out
    104857600 bytes (105 MB) copied, 7,82336 s, 13,4 MB/s 
    dd if=/dev/zero count=100 bs=1M | ssh -c aes128-cbc localhost "cat >/dev/null"
    [email protected]'s password: 
    100+0 records in
    100+0 records out
    104857600 bytes (105 MB) copied, 3,58612 s, 29,2 MB/s 

    The password was pasted quickly with Shift+Ins.
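
The rates and the speedup can be recomputed from the two dd summaries. A sketch with two made-up helper names; note that dd printed decimal commas here, which have to be typed as decimal points, and that dd's timer also covers the time spent at the password prompt:

```shell
# Throughput in MB/s (10^6 bytes per second) from a dd summary line:
# total bytes copied and elapsed seconds.
dd_rate() {
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.1f\n", b / s / 1000000 }'
}

# Ratio between the slower and the faster run time.
speedup() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.2f\n", slow / fast }'
}
```

With the numbers above, `dd_rate 104857600 7.82336` gives 13.4 and `dd_rate 104857600 3.58612` gives 29.2, matching dd's own figures, and `speedup 7.82336 3.58612` gives 2.18 for this particular pair of runs.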

    OpenVz Node + KernelCare uptime - 1275 Days :)

  • A Kidechire with a freshly installed Debian system shows me this:

    [email protected]:~# openssl speed -evp aes-128-cbc -engine padlock
    engine "padlock" set.
    Doing aes-128-cbc for 3s on 16 size blocks: 15563467 aes-128-cbc's in 3.00s
    Doing aes-128-cbc for 3s on 64 size blocks: 9630568 aes-128-cbc's in 2.99s
    Doing aes-128-cbc for 3s on 256 size blocks: 3437139 aes-128-cbc's in 3.00s
    Doing aes-128-cbc for 3s on 1024 size blocks: 1063825 aes-128-cbc's in 2.97s
    Doing aes-128-cbc for 3s on 8192 size blocks: 154125 aes-128-cbc's in 3.00s
    OpenSSL 1.0.1e 11 Feb 2013
    built on: Thu Jan  8 21:47:50 UTC 2015
    options:bn(64,32) rc4(8x,mmx) des(ptr,risc1,16,long) aes(partial) blowfish(idx) 
    compiler: gcc -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN -DTERMIO -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 -Wl,-z,relro -Wa,--noexecstack -Wall -march=i686 -DOPENSSL_BN_ASM_PART_WORDS -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DRMD160_ASM -DAES_ASM -DVPAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM
    The 'numbers' are in 1000s of bytes per second processed.
    type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
    aes-128-cbc      83005.16k   206139.25k   293302.53k   366786.80k   420864.00k
    

    Does this mean openssl/libssl is already Padlock-enabled? If not, what's the way to find out?
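
For reading this output: each figure in the summary row follows from the matching "Doing ..." line as blocks processed times block size divided by elapsed seconds, expressed in 1000s of bytes per second. A short sketch of that arithmetic; `speed_from_counts` is a hypothetical helper, not an OpenSSL command:

```shell
# Recompute a summary figure from a "Doing ... for 3s" line:
# blocks processed * block size / elapsed seconds, in 1000s of bytes/s.
speed_from_counts() {
    awk -v n="$1" -v size="$2" -v secs="$3" \
        'BEGIN { printf "%.2fk\n", n * size / secs / 1000 }'
}
```

For example, `speed_from_counts 154125 8192 3.00` prints 420864.00k, the 8192-byte figure in the table above.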

  • rm_ Member
    edited February 2015

    @akb what do you get from

    dd if=/dev/zero bs=1M count=512 | openssl sha256

    Should be ~230MB/sec.

  • [email protected]:~# dd if=/dev/zero bs=1M count=512 | openssl sha256
    512+0 records in
    512+0 records out
    536870912 bytes (537 MB) copied, 10.9468 s, 49.0 MB/s
    (stdin)= 9acca8e8c22201155389f65abbf6bc9723edc7384ead80503839f49dcc56d767
    

    Does this mean hardware acceleration isn't being used? I thought:

    [email protected]:~#  openssl engine padlock
    (padlock) VIA PadLock (no-RNG, ACE)
    

    and 'engine "padlock" set.' in the output of the command above both reflected the use of hardware acceleration by OpenSSL.

  • Is there a stable solution for Tor now? I'd like to push a little bit more bandwidth on my online.net boxes, they don't manage to do more than 10TB/month.

    tsdns.io - free, redundant, DDoS-protected TSDNS

  • Nyr Member

    @tr1cky

    Then you are doing something wrong. I do nearly 30TB with no hardware acceleration.

    I did set up acceleration, but had problems with Tor itself (don't really remember what happened, but not easily fixable). With CentOS 7, native support is there, no need to recompile OpenSSL.

  • Has anyone tried this with Tor?

    HardwareAccel 1
    AccelName padlock
    
    [notice] Default OpenSSL engine for AES-128-ECB is VIA PadLock (no-RNG, ACE) [padlock]
    [notice] Default OpenSSL engine for AES-128-CBC is VIA PadLock (no-RNG, ACE) [padlock]
    [notice] Default OpenSSL engine for AES-256-CBC is VIA PadLock (no-RNG, ACE) [padlock]
    

    ...but the CPU load on CentOS (padlock enabled) is the same as or even worse than on Ubuntu (padlock not enabled). :(
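
To double-check that a given torrc really carries both settings shown above (and not a commented-out copy), something like the following could be used. It is a sketch only: `torrc_accel_engine` is an invented name, and it checks the file's contents, not whether Tor ends up using the engine.

```shell
# Print the engine named by AccelName if the torrc-style file also sets
# "HardwareAccel 1"; fail otherwise. Commented-out lines are ignored
# because their first word starts with "#". Option names are matched
# case-insensitively.
torrc_accel_engine() {
    awk 'tolower($1) == "hardwareaccel" && $2 == "1" { on = 1 }
         tolower($1) == "accelname"                   { name = $2 }
         END { if (on && name != "") print name; else exit 1 }' "$1"
}
```

Called as `torrc_accel_engine /etc/tor/torrc` (path assumed), it would print `padlock` for the configuration above.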

    (((o(゚▽゚)o))) If privacy is outlawed, only outlaws will have privacy. (((o(゚▽゚)o)))

    ヽ(`Д´)ノ Everyone should run Tor on their idle servers.

  • I don't remember the specifics now, but didn't have much luck with PadLock and Tor either.

  • @4n0nx I tried and failed :(

  • facepalms

    I could not find it the tens of times I have searched, but now I just found this:

    https://www.mail-archive.com/[email protected]/msg74018.html

    It looks like Tor prefers AES128-GCM, which Padlock does not support, so GCM has to be disabled. Even then it doesn't work, because somehow OpenSSL does not work with Padlock the way it should ("SHA calls"?). The "bug" was closed as "not a bug".

    This is only a TL;DR;DR (as in I did not read it all either).

    So much time wasted :(

  • rm_ Member
    edited September 2015

    "OpenSSL does not support Padlock's SHA acceleration at all."

    This doesn't sound right, as my primary way of testing whether the Padlock-patched OpenSSL has been installed correctly is actually "openssl sha256"... It's 4 times faster with Padlock (230 MB/sec vs 60 MB/sec).

  • xyz Member
    edited September 2015

    I don't know specifically about Tor, but acceleration (i.e. using an alternative engine) only works if the application makes calls via EVP. If calls are being made direct to the default SHA implementation, it doesn't go through EVP and hence doesn't use the Padlock engine.

    If you can be bothered, it should be possible to patch the source code to make the calls through EVP (assuming that this is the issue).
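
The difference between the two call paths can be observed with the benchmark tool itself. This sketch assumes an OpenSSL 1.0.x-era `openssl speed`, where naming an algorithm without `-evp` benchmarks the low-level routine directly, while `-evp` goes through the EVP layer that an engine can hook into; the wrapper name `compare_sha256_paths` is invented.

```shell
# Run the SHA-256 benchmark twice: once through the low-level SHA256
# routine, and once through EVP with the padlock engine requested.
compare_sha256_paths() {
    echo "== direct (no EVP) =="
    openssl speed sha256
    echo "== via EVP, engine padlock =="
    openssl speed -evp sha256 -engine padlock
}
```

If only the second run is fast, an application that is slow despite the patch is probably calling the hash functions directly rather than through EVP.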

  • At least as of 2006, EncFS did not support the Padlock engine.

    Does anybody have more recent data on this?
