Top new questions this week:
|
The power of Graph Neural Networks is limited by that of the Weisfeiler–Lehman graph isomorphism algorithm. Quoting Wikipedia: it has been demonstrated that GNNs cannot be more expressive than the …
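The bound the question refers to can be made concrete with one-dimensional Weisfeiler–Lehman color refinement. Below is a minimal sketch (graphs as adjacency dicts, an assumed toy representation) showing a classic pair of graphs that 1-WL, and hence a message-passing GNN, cannot tell apart:

```python
from collections import Counter

def wl_refine(adj, rounds=3):
    """1-dimensional Weisfeiler-Lehman color refinement.
    adj: dict mapping node -> list of neighbour nodes."""
    colors = {v: 0 for v in adj}  # start with a uniform coloring
    for _ in range(rounds):
        # a node's new color hashes its color plus the multiset of neighbour colors
        signatures = {
            v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
            for v in adj
        }
        palette = {sig: i for i, sig in enumerate(sorted(set(signatures.values())))}
        colors = {v: palette[signatures[v]] for v in adj}
    return Counter(colors.values())  # differing histograms imply non-isomorphic graphs

# Two graphs 1-WL cannot distinguish: two triangles vs. one 6-cycle (both 2-regular)
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}
hexagon = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
print(wl_refine(two_triangles) == wl_refine(hexagon))  # True, yet they are not isomorphic
```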
|
In Spinning Up by OpenAI, it says the following regarding policy optimization methods and Q-Learning as ways of getting a good policy for RL. Trade-offs Between Policy Optimization and Q-Learning. …
|
The seminal “Attention Is All You Need” paper introduces Transformers and implements the attention mechanism with “queries, keys, values”, in an analogy to a retrieval system. I understand the …
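The retrieval analogy can be sketched directly from the paper's formula, softmax(QKᵀ/√d_k)V, here in plain NumPy with assumed toy shapes:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # query-key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the keys
    return weights @ V                              # weighted average of the values

rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(2, 4)), rng.normal(size=(3, 4)), rng.normal(size=(3, 4))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (2, 4): one value-sized output per query
```

Each query “retrieves” a soft mixture of values, weighted by how well it matches each key.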
|
In the original paper on dropout, in section 7.3.2, we see that, while keeping $pn$ constant, the (test) error increases when the retention probability is decreased below 0.6. Why would that happen? If $pn$ is …
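One ingredient of a possible answer can be seen numerically: with inverted dropout (a common formulation, assumed here) the expected activation is preserved for any retention probability $p$, but the variance injected into each unit grows like $(1-p)/p$ as $p$ shrinks:

```python
import numpy as np

def dropout(activations, p, rng):
    """Inverted dropout: keep each unit with retention probability p,
    scale the survivors by 1/p so the expectation is unchanged."""
    mask = rng.random(activations.shape) < p
    return activations * mask / p

rng = np.random.default_rng(0)
x = np.ones(100_000)
for p in (0.9, 0.6, 0.3):
    y = dropout(x, p, rng)
    # the mean stays near 1 for every p, but the variance grows as p shrinks --
    # one reason very small p can hurt even when n is scaled to keep p*n fixed
    print(p, round(y.mean(), 2), round(y.var(), 2))
```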
|
You may not believe it, but I am an ANN expert. Perhaps, for that reason, I am unable to grasp completely what the layers are in a Deep Forward Artificial Neural Network (DFANN). According to the Deep …
|
I wonder if machine learning has ever been applied to space-time diagrams of cellular automata. What comprises a training set seems clear: a number of space-time diagrams of one or several (elementary)…
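Such a training set is cheap to generate. A minimal sketch (Wolfram rule numbering, periodic boundary, assumed parameters) producing one space-time diagram per rule:

```python
import numpy as np

def ca_spacetime(rule, width=64, steps=32, rng=None):
    """Space-time diagram of an elementary cellular automaton:
    each row is one time step, giving a 2-D 'image' labeled by its rule."""
    rng = rng or np.random.default_rng(0)
    table = [(rule >> i) & 1 for i in range(8)]  # rule number -> lookup table
    row = rng.integers(0, 2, size=width)         # random initial configuration
    diagram = [row]
    for _ in range(steps - 1):
        # neighborhood code: 4*left + 2*center + 1*right, periodic boundary
        neighborhoods = 4 * np.roll(row, 1) + 2 * row + np.roll(row, -1)
        row = np.array([table[n] for n in neighborhoods])
        diagram.append(row)
    return np.stack(diagram)

img = ca_spacetime(rule=110)
print(img.shape)  # (32, 64)
```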
|
Hello 🙂 I’m required to write a document where I describe what DQL does in short. This is what I wrote: DQL: instead of a Q-table, a DNN is used to approximate the Q-values for each action based on a …
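The sentence being described can be illustrated with a deliberately tiny stand-in: a linear model in place of the DNN (an assumption for brevity; deep Q-learning would use a neural network), updated with the usual bootstrapped Q-learning target instead of a Q-table write:

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, n_actions = 4, 2
W = rng.normal(scale=0.1, size=(n_actions, n_features))  # the "network": linear here

def q_values(state):
    return W @ state  # one Q-value per action, replacing one Q-table row

def td_update(state, action, reward, next_state, gamma=0.99, lr=0.01):
    """One Q-learning step on the approximator's parameters."""
    global W
    target = reward + gamma * q_values(next_state).max()  # bootstrapped target
    td_error = target - q_values(state)[action]
    W[action] += lr * td_error * state                    # gradient step, linear model
    return td_error

s, s2 = rng.normal(size=n_features), rng.normal(size=n_features)
print(td_update(s, action=0, reward=1.0, next_state=s2))
```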
|
Greatest hits from previous weeks:
|
As far as I can tell, BERT is a type of Transformer architecture. What I do not understand is: how is BERT different from the original Transformer architecture? What tasks are better suited for BERT, …
|
As far as I can tell, neural networks have a fixed number of neurons in the input layer. If neural networks are used in a context like NLP, sentences or blocks of text of varying sizes are fed to a …
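The standard workaround the question is reaching for is padding and truncation to a fixed length, with a mask marking which positions are real tokens. A minimal sketch (hypothetical helper, pad id 0 assumed):

```python
def pad_or_truncate(token_ids, max_len, pad_id=0):
    """Force a variable-length token sequence to a fixed length:
    truncate long inputs, pad short ones; the mask marks real tokens."""
    ids = token_ids[:max_len]
    mask = [1] * len(ids) + [0] * (max_len - len(ids))
    ids = ids + [pad_id] * (max_len - len(ids))
    return ids, mask

print(pad_or_truncate([5, 8, 2], max_len=5))           # ([5, 8, 2, 0, 0], [1, 1, 1, 0, 0])
print(pad_or_truncate([5, 8, 2, 9, 4, 7], max_len=5))  # ([5, 8, 2, 9, 4], [1, 1, 1, 1, 1])
```

Recurrent models offer an alternative, consuming a sequence one token at a time with a fixed-size input per step.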
|
Is it possible to feed a neural network the output from a random number generator and expect it to learn the hashing (or generator) function, so that it can predict what will be the next generated pseudo-…
|
For instance, the title of this paper reads: “Sample Efficient Actor-Critic with Experience Replay”. What is sample efficiency, and how can importance sampling be used to achieve it?
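The connection to importance sampling can be sketched on a toy two-action bandit (all numbers assumed for illustration): samples gathered under an old behavior policy are reweighted by the probability ratio so they can still estimate the new target policy's return, squeezing more value out of each stored sample:

```python
import numpy as np

rng = np.random.default_rng(0)
actions = np.array([0, 1])
behavior = np.array([0.5, 0.5])  # policy that generated the stored experience
target = np.array([0.2, 0.8])    # policy we want to evaluate without fresh samples

sampled = rng.choice(actions, size=100_000, p=behavior)
rewards = np.where(sampled == 1, 1.0, 0.0)  # toy reward: 1 for action 1, else 0

# importance weights correct for the mismatch between the two policies
weights = target[sampled] / behavior[sampled]
estimate = np.mean(weights * rewards)
print(round(estimate, 2))  # close to the true target-policy return of 0.8
```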
|
If the original purpose for developing AI was to help humans in some tasks and that purpose still holds, why should we care about its explainability? For example, in deep learning, as long as the …
|
I was surveying some literature related to Fully Convolutional Networks and came across the following phrase, A fully convolutional network is achieved by replacing the parameter-rich fully …
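The replacement in that phrase rests on an identity worth seeing numerically: a fully connected layer over a feature map is exactly a convolution whose kernel covers the whole map. A small NumPy check with assumed toy shapes:

```python
import numpy as np

rng = np.random.default_rng(0)
fmap = rng.normal(size=(8, 5, 5))        # C x H x W feature map
W_fc = rng.normal(size=(10, 8 * 5 * 5))  # fully connected layer: 10 outputs

fc_out = W_fc @ fmap.reshape(-1)

# The same weights viewed as ten 8x5x5 convolution kernels applied at one position:
kernels = W_fc.reshape(10, 8, 5, 5)
conv_out = np.einsum('ochw,chw->o', kernels, fmap)

print(np.allclose(fc_out, conv_out))  # True: an FC layer is a full-size-kernel convolution
```

Because the convolutional form slides, the converted network then accepts inputs of any spatial size.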
|
Suppose that I have 10K images of sizes $2400 \times 2400$ to train a CNN. How do I handle such large image sizes without downsampling? Here are a few more specific questions. Are there any …
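One common answer to this question is tiling: train on crops so full resolution is preserved without downsampling. A minimal sketch with an assumed patch size of 600:

```python
import numpy as np

def extract_patches(image, patch=600):
    """Tile a large image into non-overlapping patches so a CNN can
    train at full resolution without shrinking the whole frame."""
    h, w = image.shape[:2]
    return [
        image[i:i + patch, j:j + patch]
        for i in range(0, h - patch + 1, patch)
        for j in range(0, w - patch + 1, patch)
    ]

big = np.zeros((2400, 2400), dtype=np.uint8)
patches = extract_patches(big, patch=600)
print(len(patches), patches[0].shape)  # 16 patches of 600x600
```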
|
Can you answer this question?
|
Suppose we have the following neural network (in reality it is a CNN with 60k parameters). This image, as well as the terminology used here, is borrowed from Matt Mazur. As is visible, there are two …
|