{"id":12314,"date":"2023-12-01T00:12:49","date_gmt":"2023-12-01T05:12:49","guid":{"rendered":"https:\/\/spinor.info\/weblog\/?p=12314"},"modified":"2023-12-01T00:12:49","modified_gmt":"2023-12-01T05:12:49","slug":"move-over-gpt-make-room-for-the-llama","status":"publish","type":"post","link":"https:\/\/spinor.info\/weblog\/?p=12314","title":{"rendered":"Move over, GPT, make room for&#8230; the llama?"},"content":{"rendered":"<p>Well, here it is, a local copy of a <a href=\"https:\/\/github.com\/Mozilla-Ocho\/llamafile\">portable large language and visual model<\/a>. An everywhere-run executable in a mere 4 GB. Here&#8217;s my first test, with a few random questions and an image (one of my favorite Kliban cartoons) to analyze:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-12315\" src=\"https:\/\/spinor.info\/weblog\/wp-content\/uploads\/2023\/12\/llama.png\" alt=\"\" width=\"478\" height=\"766\" srcset=\"https:\/\/spinor.info\/weblog\/wp-content\/uploads\/2023\/12\/llama.png 478w, https:\/\/spinor.info\/weblog\/wp-content\/uploads\/2023\/12\/llama-187x300.png 187w\" sizes=\"(max-width: 478px) 100vw, 478px\" \/><\/p>\n<p>Now 4.57 tokens per second is not exactly fast but hey, it runs on my 7-year-old workstation, with no GPU acceleration, and yet its performance is more than decent.<\/p>\n<p>How is this LLM different from GPT or Claude? Well, it requires no subscription and no Internet connection. It is entirely self-contained, and fast enough to run on run-of-the-mill PC hardware.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Well, here it is, a local copy of a portable large language and visual model. An everywhere-run executable in a mere 4 GB. 
Here&#8217;s my first test, with a few random questions and an image (one of my favorite Kliban cartoons) to analyze: Now 4.57 tokens per second is not exactly fast but hey, it <a href='https:\/\/spinor.info\/weblog\/?p=12314' class='excerpt-more'>[&#8230;]<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[58,35],"tags":[],"class_list":["post-12314","post","type-post","status-publish","format-standard","hentry","category-cybernetics","category-personal","category-58-id","category-35-id","post-seq-1","post-parity-odd","meta-position-corners","fix"],"_links":{"self":[{"href":"https:\/\/spinor.info\/weblog\/index.php?rest_route=\/wp\/v2\/posts\/12314","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/spinor.info\/weblog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/spinor.info\/weblog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/spinor.info\/weblog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/spinor.info\/weblog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=12314"}],"version-history":[{"count":2,"href":"https:\/\/spinor.info\/weblog\/index.php?rest_route=\/wp\/v2\/posts\/12314\/revisions"}],"predecessor-version":[{"id":12317,"href":"https:\/\/spinor.info\/weblog\/index.php?rest_route=\/wp\/v2\/posts\/12314\/revisions\/12317"}],"wp:attachment":[{"href":"https:\/\/spinor.info\/weblog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=12314"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/spinor.info\/weblog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=12314"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/spinor.info\/weblog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=12314"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}
}