Machine learning

    [N] Diffusion Models Live Event

    https://preview.redd.it/p7hlm9i3i42a1.png?width=1632&format=png&auto=webp&s=3ca84b1e45b077e3c4ff38506635a1f01463a14a

    Hi there, it's Lewis here from Hugging Face 👋

    Our diffusion models class with Jonathan Whitaker kicks off next week, and to celebrate we're hosting a live event of talks and discussion with the creators of Stable Diffusion and folks from Stability AI, Meta, and Lambda Labs 🧨!

    If you'd like to take part, you can sign up here: https://huggingface.co/blog/diffusion-models-event

    submitted by /u/lewtun

    [Project] Background removal tool based on our recent work "Revisiting Image Pyramid Structure for High Resolution Salient Object Detection"

    We made a background removal tool named transparent-background, based on our recent work "Revisiting Image Pyramid Structure for High Resolution Salient Object Detection" (InSPyReNet), which will be published at ACCV 2022.

    For better performance, we trained our model on various publicly available salient object detection datasets. We think our tool works better than currently available tools such as Apple's recent background removal feature for iOS and macOS, or https://www.remove.bg.

    You can use our tool as a command-line tool or as a Python API.
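    For reference, usage looks roughly like this: a minimal sketch assuming the interface documented in the README (the CLI equivalent is roughly `transparent-background --source input.jpg --dest outputs/`); exact class and argument names may have changed, so please check the repository.

    ```python
    # Minimal sketch of the Python API, based on the repository's README at the
    # time of writing; class/argument names may differ in newer versions.
    from PIL import Image
    from transparent_background import Remover

    remover = Remover()                           # loads pretrained InSPyReNet weights
    img = Image.open("input.jpg").convert("RGB")
    out = remover.process(img)                    # background removed; assumed here to
                                                  # be a numpy array in this version
    Image.fromarray(out).save("output.png")
    ```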

    Please visit our GitHub repository and try it on your own images and videos.

    transparent-background: https://github.com/plemeri/transparent-background

    InSPyReNet: https://github.com/plemeri/InSPyReNet

    Here is a sample result from Apple's recent background removal tool and from ours.

    Input image

    Result from Apple's recent background removal tool

    Result from our tool "transparent-background"

    submitted by /u/swdsld

    [D] Differentiate between same background negative and positive image

    I'm building a binary classifier that classifies whether insects are present in an image or not. Negative images can be of any object (e.g., bike, car, plant, a plain white image, etc.), but positive images have insects on a white sheet/background (see the example image below for reference).

    https://preview.redd.it/7knopbijjb1a1.png?width=300&format=png&auto=webp&s=1cb74d603aa4be5e2a7d8b8ffb6dc6ba59a6dd09

    Now the classifier is able to differentiate between easy negatives (like bike vs. insect or plant vs. insect). But whenever there is a white background in the negative image, it classifies it as positive (i.e., insects present). I have tried data augmentations like pixel dropout and coarse dropout, but there are no improvements. I also tried focal loss, but with focal loss it performs worse even after more than double the epochs I trained with cross-entropy, and it takes much longer to train.
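    For context, this kind of dropout-style augmentation looks roughly like the sketch below (using albumentations; the parameter values are illustrative placeholders, not the exact ones from the post):

    ```python
    # Illustrative sketch of pixel/coarse dropout augmentation with albumentations;
    # parameter values are placeholders, not the exact ones used in the post.
    import albumentations as A

    train_transform = A.Compose([
        A.Resize(224, 224),
        A.PixelDropout(dropout_prob=0.01, p=0.5),      # randomly zero out single pixels
        A.CoarseDropout(max_holes=8, max_height=32,    # cut out rectangular patches so the
                        max_width=32, p=0.5),          # model can't rely on one region
        A.Normalize(),
    ])
    # augmented = train_transform(image=img)["image"]  # img: HxWx3 uint8 numpy array
    ```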

    I'm fine-tuning resnet50 (from here) for classification.

    Can anyone suggest how I can deal with this issue?

    submitted by /u/Curious_Monkey7

    [R] RWKV-4 7B release: an attention-free RNN language model matching GPT-J performance (14B training in progress)

    Hi everyone. I have finished training RWKV-4 7B (an attention-free RNN LLM) and it can match GPT-J (6B params) performance. Maybe RNN is already all you need 🙂

    https://preview.redd.it/71cce2y75j0a1.png?width=1336&format=png&auto=webp&s=5af76abc4f42fd63f0194ee93f78db01c1b21d97

    Previous discussion: https://www.reddit.com/r/MachineLearning/comments/xfup9f/r_rwkv4_scaling_rnn_to_7b_params_and_beyond_with/

    RWKV has both an RNN mode and a GPT mode. The RNN mode is great for inference; the GPT mode is great for training. Both modes are faster than a usual transformer and save VRAM, because the self-attention mechanism is replaced by simpler (almost linear) formulas. Moreover, the hidden state in RNN mode is tiny, and you can use it as an embedding of the whole context.
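    To illustrate why RNN mode is attractive at inference time, here is a toy constant-time-per-token state update. This is not RWKV's actual time-mixing formula (see the GitHub repo for that), just a sketch of the general idea:

    ```python
    # Toy sketch of constant-time-per-token RNN inference; NOT RWKV's actual
    # recurrence. The point: each step costs O(d^2) regardless of context length,
    # and `state` is a small fixed-size vector that doubles as an embedding of
    # the whole context seen so far.
    import numpy as np

    d = 64
    W_in = np.random.randn(d, d) * 0.1    # hypothetical input weights
    W_rec = np.random.randn(d, d) * 0.1   # hypothetical recurrence weights

    def step(state, token_embedding):
        # one fixed-cost update per token, independent of how long the context is
        return np.tanh(W_rec @ state + W_in @ token_embedding)

    state = np.zeros(d)
    for tok in np.random.randn(1000, d):  # a 1000-token context
        state = step(state, tok)
    # `state` now summarizes the whole context in d floats, unlike a
    # transformer's KV cache, which grows with context length.
    ```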

    GitHub: https://github.com/BlinkDL/RWKV-LM

    Checkpoint: https://huggingface.co/BlinkDL/rwkv-4-pile-7b

    14B in progress (thanks to EleutherAI and Stability). Nice spike-free loss curves:

    https://preview.redd.it/w4g7oqmi5j0a1.png?width=868&format=png&auto=webp&s=346d420fb879fd06470079eeaf2e4d3739536406

    submitted by /u/bo_peng

    [Research] MinD-Vis: Seeing Beyond the Brain - Conditional Diffusion Model with Sparse Masked Modeling for Vision Decoding

    MinD-Vis: another paper about decoding images from brain data with AI. The BCI-->AI pipeline seems to be getting faster, after a paper from Meta AI, a bioRxiv preprint on the semantic reconstruction of continuous language from non-invasive brain recordings, and a GAN that reproduces faces seen by a monkey.

    https://preview.redd.it/fuctf8ctr20a1.png?width=1784&format=png&auto=webp&s=c71315d5c54485c9a2719739338c175d9b6fa27b

    Results in this paper are only somewhat accurate as of now, but given the speed of improvements this year, this could move very fast.

    Time to get serious about the impact of BCI/AI on neuroprivacy, not to speak of the psychological effects of technology that can read your thoughts and generate plausible output, maybe even without you noticing. Just as hackers can 3D-scan a room with WiFi signals, this technique may be used to infer neural data (from blood pressure? heat?).

    Also, very soon, AI systems may be trained directly on neural data to produce mimetic AI models that can act, to varying extents, on behalf of the owner of that data; the questions these prospects raise are anything but addressed.

    ---

    Paper: https://arxiv.org/abs/2211.06956

    From their project page:

    Motivation

    Decoding visual stimuli from brain recordings aims to deepen our understanding of the human visual system and build a solid foundation for bridging human vision and computer vision through the Brain-Computer Interface. However, due to the scarcity of data annotations and the complexity of underlying brain information, it is challenging to decode images with faithful details and meaningful semantics.

    Contribution

    In this work, we present MinD-Vis: Sparse Masked Brain Modeling with Double-Conditioned Diffusion Model for Vision Decoding. Specifically, by boosting the information capacity of representations learned from a large-scale resting-state fMRI dataset, we show that our MinD-Vis framework reconstructs highly plausible images with semantically matching details from brain recordings, using very few training pairs. We benchmarked our model, and our method outperformed the previous state of the art in both semantic mapping (100-way semantic classification) and generation quality (FID), by 66% and 41% respectively. Exhaustive ablation studies were conducted to analyze our framework.

    Highlights

    - A human visual decoding system that relies on only limited annotations.

    - State-of-the-art 100-way top-1 classification accuracy on the GOD dataset: 23.9%, outperforming the previous best by 66%.

    - State-of-the-art generation quality (FID) on the GOD dataset: 1.67, outperforming the previous best by 41%.

    - For the first time, we show that non-invasive brain recordings can be used to decode images with performance similar to invasive measures.

    submitted by /u/walt74

    [P] FastDeploy: an easy-to-use AI model deployment toolkit (supports 150+ text, vision, and speech AI models; for example, three lines of core code can deploy a YOLO-series model; supports deployment across server, mobile, embedded, and IoT devices)

    Hi all,

    Code: https://github.com/PaddlePaddle/FastDeploy

    I am glad to share that my team is working on an open-source repository, FastDeploy, which provides an easy-to-use, high-performance AI model deployment toolkit for cloud and edge, with an out-of-the-box, unified experience (including deployment on x86 CPU, NVIDIA GPU, ARM CPU, Graphcore IPU, XPU, NPU, etc.). As there are more and more types of AI models and AI hardware, AI engineers urgently need unified deployment tooling to reduce the cycle time and difficulty of AI deployment. Our original intention for this product is: train AI models once, deploy them anywhere.
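    As a rough illustration of the "three lines of core code" claim for a YOLO-series detector, a sketch based on the repo's examples at the time of writing (module paths and signatures may differ in current releases; please check the repository):

    ```python
    # Sketch of deploying a YOLO-series detector with FastDeploy; module paths and
    # signatures follow the repo's examples and may have changed since --
    # see https://github.com/PaddlePaddle/FastDeploy for current usage.
    import cv2
    import fastdeploy as fd

    im = cv2.imread("test.jpg")
    model = fd.vision.detection.YOLOv5("yolov5s.onnx")   # 1) load the model
    result = model.predict(im)                           # 2) run inference
    vis = fd.vision.vis_detection(im, result)            # 3) visualize detections
    cv2.imwrite("result.jpg", vis)
    ```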

    A list of supported server-side and cloud models is available in the repository.

    We hope that more people can benefit from the project.

    Thank you, and we look forward to your feedback.

    FastDeploy R&D Team.

    submitted by /u/Putrid-Snow1185

    [R] Neurosymbolic Programming for Science

    Neurosymbolic Programming (NP) techniques have the potential to accelerate scientific discovery. These models combine neural and symbolic components to learn complex patterns and representations from data, using high-level concepts or known constraints. NP techniques can interface with symbolic domain knowledge from scientists, such as prior knowledge and experimental context, to produce interpretable outputs. We identify opportunities and challenges at the interface of current NP models and scientific workflows, drawing on real-world examples from behavior analysis, with the aim of enabling the broad use of NP in workflows across the natural and social sciences.
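    As a toy illustration of the neural-plus-symbolic split (not code from the paper): a neural module maps raw data to named concepts, and a human-readable symbolic rule, which could encode a scientist's domain knowledge, operates on those concepts.

    ```python
    # Toy illustration of a neurosymbolic program (NOT from the paper): a neural
    # module learns to map raw behavior features to named concepts, and an
    # interpretable symbolic rule combines them.
    import torch
    import torch.nn as nn

    # neural component: raw trajectory features -> (speed, distance_change) concepts
    concept_net = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))

    def chasing_rule(concepts, speed_threshold=0.5):
        # symbolic component: "chasing" := high speed AND closing distance
        speed, distance_change = concepts[..., 0], concepts[..., 1]
        return (speed > speed_threshold) & (distance_change < 0)

    features = torch.randn(8, 16)          # a batch of behavior-feature vectors
    predictions = chasing_rule(concept_net(features))
    ```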

    https://preview.redd.it/kkzt8t6f8zy91.png?width=2258&format=png&auto=webp&s=d808675ecd7837425fb12a92009dd0d3e0fa5f68

    Paper: https://arxiv.org/abs/2210.05050

    submitted by /u/insider_7
